RANDOM MATRICES: THE CIRCULAR LAW
Abstract. Let $x$ be a complex random variable with mean zero and bounded
variance $\sigma^2$. Let $N_n$ be a random matrix of order $n$ with entries being i.i.d.
copies of $x$. Let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $\frac{1}{\sigma\sqrt{n}} N_n$. Define the empirical
spectral distribution $\mu_n$ of $N_n$ by the formula

$$\mu_n(s,t) := \frac{1}{n} \#\{k \le n \mid \mathrm{Re}(\lambda_k) \le s;\ \mathrm{Im}(\lambda_k) \le t\}.$$

The following well-known conjecture has been open since the 1950's:
Circular law conjecture: $\mu_n$ converges to the uniform distribution $\mu_\infty$ over
the unit disk as $n$ tends to infinity.

We prove this conjecture, with strong convergence, under the slightly stronger
assumption that the $(2+\eta)$-th moment of $x$ is bounded, for any $\eta > 0$.
Our method builds and improves upon earlier work of Girko, Bai,
Götze-Tikhomirov, and Pan-Zhou, and also applies for sparse random matrices.

The new key ingredient in the paper is a general result about the least 
singular value of random matrices, which was obtained using tools and ideas 
from additive combinatorics. 



1. Introduction 



Let $x$ be a complex random variable with finite non-zero variance $0 < \sigma^2 < \infty$
and $N_n$ be the random matrix of order $n$ with entries being i.i.d. copies of $x$. Let
$\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $\frac{1}{\sigma\sqrt{n}} N_n$. Define the empirical spectral distribution
(ESD) $\mu_n$ of $N_n$ by the formula

$$\mu_n(s,t) := \frac{1}{n} \#\{k \le n \mid \mathrm{Re}(\lambda_k) \le s;\ \mathrm{Im}(\lambda_k) \le t\}.$$

We say that the (strong) circular law holds for $x$ if, with probability 1, the spectral
distribution $\mu_n$ converges (uniformly) to the uniform distribution

$$\mu_\infty(s,t) := \frac{1}{\pi} \mathrm{mes}(\{z \in \mathbf{C} \mid |z| \le 1;\ \mathrm{Re}(z) \le s;\ \mathrm{Im}(z) \le t\})$$

over the unit disk as $n$ tends to infinity. In the literature one also sees the weak
circular law, which asserts that for any fixed $s$ and $t$, $\mu_n(s,t)$ converges to
$\mu_\infty(s,t)$ in probability.

As the name suggests, the weak circular law is easier to prove than the strong one.
Using the approach in [2], the proofs of both types of convergence boil down to
controlling the least singular value of $\frac{1}{\sqrt{n}} N_n - zI$. For the weak convergence, one
needs a bound with failure probability tending to zero as $n$ tends to infinity (this
is the approach taken in [11, 12], for example). On the other hand, for the strong
convergence one needs the failure probability to be summable in $n$. This appears much
more difficult and we will discuss it in more detail in Section 2 (see the paragraph
following Theorem 2.1).

In this paper we shall be concerned exclusively with the strong circular law, and in 
particular with regard to the following well-known conjecture: 

Circular law conjecture. The strong circular law holds for any complex variable x 
with zero mean and finite non-zero variance. 

The circular law conjecture was formulated in the early 1950s, as a natural (non- 
hermitian) counterpart of Wigner's semi-circle law. Since then, several partial 
results have been obtained, at the cost of extra assumptions on the distribution of 
the basic variable x. In the next few paragraphs, we give a brief survey of these 
results. 

If $x$ is complex Gaussian, the conjecture was proved by Mehta [18] in 1967, using
the joint density function of the eigenvalues $\lambda_k$, which was discovered by Ginibre a few
years earlier [13]. An important breakthrough was made by Bai [1], following an
earlier work of Girko [9]. (Bai's paper discussed Girko's paper carefully and pointed
out some gaps in that paper.) In [1], Bai proved the claim under the assumption
that $x$ has finite sixth moment ($\mathbf{E}|x|^6 < \infty$) and that the joint distribution of the
real and imaginary parts of $x$ has a bounded density. Recently, in [2, Chapter
10], a finer result was obtained showing that the sixth moment hypothesis can be
weakened to $\mathbf{E}|x|^{2+\eta} < \infty$ for any specified $\eta > 0$. However, the bounded density
assumption remains critical. This assumption, unfortunately, excludes several
important distributions, for instance discrete distributions such as Bernoulli random
variables $x \in \{-1, +1\}$.

Theorem 1.1. [2, Theorem 10.3] Assume that the complex random variable $x$
has zero mean and finite $(2+\eta)$-th moment for some $\eta > 0$, and also that the joint
distribution of the real and imaginary part has a bounded density. Then the circular
law holds for $x$.

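A key idea in [1] is to analyze the ESD through its Stieltjes transform
$s_n : \mathbf{C} \to \mathbf{C}$, defined by the formula^1

$$s_n(z) := \frac{1}{n} \sum_{k=1}^n \frac{1}{\lambda_k - z}.$$

As $s_n(z)$ is analytic everywhere except at the poles, the real part already determines
the eigenvalues $\lambda_k$. If we write $s_n(z) = s_{nr}(z) + \sqrt{-1}\, s_{ni}(z)$, $\lambda_k = \lambda_{kr} + \sqrt{-1}\, \lambda_{ki}$ and
$z = s + \sqrt{-1}\, t$, we have the important identity

$$s_{nr}(z) = \frac{1}{n} \sum_{k=1}^n \frac{\lambda_{kr} - s}{|\lambda_k - z|^2} = -\frac{1}{2n} \sum_{k=1}^n \frac{\partial}{\partial s} \log |\lambda_k - z|^2 = -\frac{1}{2} \frac{\partial}{\partial s} \int_0^\infty \log x \, \nu_n(dx, z),$$

where $\nu_n(\cdot, z)$ is the ESD of the Hermitian matrix $H_n := (\frac{1}{\sqrt{n}} N_n - zI)(\frac{1}{\sqrt{n}} N_n - zI)^*$.
The task then reduces (at least in principle) to controlling the distributions $\nu_n(\cdot, z)$.

^1 We are using $\sqrt{-1}$ for the imaginary unit, as we wish to reserve $i$ as an index of summation.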

The log function has two poles, at $\infty$ and 0. The first one is easy to deal with,
as one can bound the largest singular value by a polynomial in $n$. The pole at 0
poses a much more serious obstacle, since the smallest eigenvalue of $H_n$ (or the
least singular value of $N_n - zI$) can be arbitrarily close to 0. (In fact, if the matrix
is singular, which happens with positive probability in discrete models, then the
least singular value is 0.) The bounded density assumption in Theorem 1.1 was
introduced primarily in order to handle this obstacle.

In the last few years, the least singular value problem has become better understood 
in the discrete case, thanks to a series of papers [27, 20, 21, 28]. In these papers, 
strong lower bounds for the least singular value of a random matrix [27, 20, 21] or 
a random perturbation of a fixed matrix [28] were obtained. As a consequence, the 
circular law has recently been established for various new classes of distributions. 
For instance, Götze and Tikhomirov [11] proved the weak circular law for any
sub-Gaussian^2 distribution $x$, using the arguments from [20]. In [10], Girko established
the weak circular law assuming a bounded $(4+\delta)$-th moment for some $\delta > 0$. Relying
on [21], Pan and Zhou [19] were recently able to verify the strong circular law for
any distribution with a bounded fourth moment. This assumption is needed for a
number of reasons, in particular allowing one to bound the operator norm of $N_n$ by
$O(\sqrt{n})$ with high probability. Very recently (a few months after the current paper
was first posted on the arXiv), Götze and Tikhomirov [12] proved the weak circular
law under an assumption similar to our main theorem below.

In this paper, we prove the circular law only assuming a bounded $(2+\eta)$-th moment,
for any fixed $\eta > 0$. In particular, we have completely removed the bounded density
function assumption in Theorem 1.1.

Theorem 1.2 (Circular law). Assume that $x$ is a complex random variable with
zero mean and finite $(2+\eta)$-th moment for some $\eta > 0$, with strictly positive variance.
Then the strong circular law holds for $x$.

This result can be further strengthened in several directions: 

• We can further relax the condition $\mathbf{E}|x|^{2+\eta} < \infty$ to $\mathbf{E}|x|^2 \log^C(2+|x|) < \infty$,
where $C$ is a sufficiently large absolute constant. (For instance, $C = 16$ is
sufficient; see Section 13 for details.)

• It is not necessary to assume that the entries have identical distributions. It
suffices to assume that they are independent, have mean zero with uniformly
bounded $(2+\eta)$-th moments, and that they are all dominated (in a
Fourier-analytic sense) by a single random variable with finite non-zero variance
and bounded $(2+\eta)$-th moment; see Remark 2.8 below. (See also [2, p. 326-327]
in which the extension of the circular law to the case of non-identical
distributions is discussed.)

• One can obtain some quantitative estimates on the rate of convergence as
well. For example, under the $(2+\eta)$-th moment assumption, we can show
that almost surely, the distance $\sup_{s,t} |\mu_n(s,t) - \mu_\infty(s,t)|$ between $\mu_n$ and
the limiting distribution $\mu_\infty$ in the uniform metric is at most $n^{-\eta'}$ for some
constant $\eta' > 0$ and all sufficiently large $n$.

^2 A variable is sub-Gaussian if it has exponential tail; in particular all of its moments are
bounded.

It is too technical to address these points in the main proof, so we are going to 
first prove Theorem 1.2 and sketch out the necessary modifications to obtain these 
refinements in Sections 13, 14. 

The circular law also holds for sparse random matrices. For $0 < \mu \le 1$, let $I_\mu$ be
the boolean random variable which takes value 1 with probability $\mu$ and 0 with
probability $1 - \mu$. Let $p = n^{-1+\alpha}$, for a positive constant $\alpha$. Let $N_{n,p}$ be the
random matrix with the $ij$ entry being $I_{ij,p} x_{ij}$, where the $I_{ij,p}$ and $x_{ij}$ are jointly
independent iid copies of $I_p$ and $x$ respectively. Götze and Tikhomirov [11] proved
that if $x$ is sub-Gaussian and $\alpha > 3/4$, then $N_{n,p}$ admits the circular law. We can
prove the following strengthening of this result:

Theorem 1.3 (Circular law for sparse matrices). Let $\alpha > 0$ and $\eta > 0$ be arbitrary
positive constants. Assume that $x$ is a complex random variable with zero mean and
finite $(2+\eta)$-th moment. Set $p = n^{-1+\alpha}$ and let $\mu_{n,p}$ be the ESD of $\frac{1}{\sigma\sqrt{pn}} N_{n,p}$, where
$\sigma^2$, as usual, is the variance of $x$. Then $\mu_{n,p}$ tends to the uniform distribution $\mu_\infty$
over the unit disk as $n$ tends to infinity.

Remark 1.4. If one takes $\alpha = 0$, the circular law no longer holds. In this case,
$p = 1/n$ and each entry equals 0 with probability $1 - 1/n$. Thus, a row is all-zero
with probability $(1 - 1/n)^n \approx e^{-1}$. Since the rows are independent, it is easy to
show that with high probability one has $\Theta(n)$ all-zero rows. But this means that
the ESD, with high probability, has positive constant mass at the origin.
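
The arithmetic behind Remark 1.4 can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and it assumes that an entry vanishes exactly when its Bernoulli selector does (that is, $P(x = 0) = 0$), so that each entry is zero with probability exactly $1 - 1/n$.

```python
import math


def all_zero_row_probability(n: int) -> float:
    """Chance that n independent entries, each zero with probability 1 - 1/n, all vanish."""
    return (1.0 - 1.0 / n) ** n


def expected_zero_rows(n: int) -> float:
    """Expected number of all-zero rows among the n independent rows."""
    return n * all_zero_row_probability(n)


if __name__ == "__main__":
    # The fraction of all-zero rows approaches e^{-1} from below as n grows.
    for n in (10, 100, 1000, 10000):
        print(n, all_zero_row_probability(n), expected_zero_rows(n) / n, math.exp(-1))
```

Under these assumptions a constant fraction of the rows (about $e^{-1}$ of them) is expected to vanish, which is the source of the mass at the origin.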

We shall prove this theorem in parallel with Theorem 1.2, by indicating at various 
junctures what the "sparse" version of certain key lemmas are. 

The key ingredient in our proof of the circular law is a new lower bound for the
least singular value of the matrix $M + N_n$, where $M$ is an arbitrary matrix with
complex entries having absolute values bounded from above by a polynomial in
$n$. For the circular law, we only need to consider the case $M = -zI$, where $I$ is
the identity matrix. On the other hand, the general case is interesting in its own
right and proves useful in other areas of mathematics (see, for example, [28]). Our
arguments permit the coefficients of $M$ or $N_n$ to be as large as $n^C$ for any fixed
constant $C$, which is the main reason why we do not need any stronger moment
control on $x$ beyond the $(2+\eta)$-th moment.

The rest of the paper is organized as follows. In the next section, we present the
above-mentioned result on the least singular value. The key tool for proving this
result is a so-called Inverse Littlewood-Offord theorem, discussed in Section 3. This
theorem is motivated by several previous results in the same spirit from [27]. On
the other hand, the bound in Section 3 is nearly optimal and is sharper than one
that can be deduced from [27]. This improvement is critical to us.

The proof of the Inverse Littlewood-Offord theorem is technical and requires several
lemmas, developed in Sections 4-9. In particular, we prove a forward
Littlewood-Offord theorem (Theorem 6.6), which seems to be of interest in its own right. The
proof of the Inverse Theorem follows in Section 10. Next, we prove the desired
bound on the least singular value in Section 11. The proof of the circular law
follows in Section 12. The rest of the paper is devoted to various refinements of the
circular law; for instance, Theorem 1.3 is discussed in Section 15.

In order to handle the sparse case, we need sparse versions of all the tools
mentioned above. These results can be proved using the same argument with some
modifications. We will only sketch these proofs in the paper.

Let us conclude this section with our notation. 

Definition 1.5 (Asymptotic notation). In the whole paper we assume that $n$ is
sufficiently large, whenever needed. Asymptotic notation is used under the
assumption that $n \to \infty$. Let $X$ and $Y$ be non-negative quantities. $X = O(Y)$, $X \ll Y$,
$Y \gg X$ and $Y = \Omega(X)$ all mean that $X \le CY$ for some positive constant $C$, and
$X = \Theta(Y)$ means $X \ll Y \ll X$; $X = o(Y)$ means that $|X| \le c(n) Y$ where $c(n)$
goes to zero as $n \to \infty$.

In many cases, we want to indicate that the hidden constants in $O, \Omega, \Theta$ or $\ll, \gg$
depend on some additional parameters. In such cases, we will indicate this by
subscripts. For instance, $X = O_\epsilon(Y)$ means that there is a positive constant $C(\epsilon)$
depending only on $\epsilon$ such that $X \le C(\epsilon) Y$.

Throughout the paper, letters $A, B, C, c, \alpha, \epsilon, \eta, \delta, \kappa$ are used to denote constants.
Letters $\mu, p, \beta$ denote quantities that may depend on $n$.

We use $\mathbf{P}$ to denote probability, $\mathbf{E}$ to denote expectation, and $I_p$ to denote indicator
functions of expectation $p$ as used earlier in this section. If $E$ is an event, we use
$I(E)$ to denote the indicator of $E$, which equals 1 when $E$ is true and 0 otherwise.
The cardinality of a finite set $S$ will be denoted $\#S$, and the Lebesgue measure of
a set $A \subset \mathbf{C}$ will be denoted $\mathrm{mes}(A)$.

2. Least singular value bound 

Let $M$ be a matrix of order $n$. We use $\|M\|$ to denote the spectral norm of $M$ (i.e.
the largest singular value of $M$)

$$\|M\| = \sup_{|v|=1} |Mv|.$$

As discussed in the previous section, a key point in Bai's approach is to obtain
control on the lower tail distribution for the least singular value of $\frac{1}{\sqrt{n}} N_n - zI_n$,
or equivalently to obtain control on the upper tail distribution of the norm of the
inverse $\|(\frac{1}{\sqrt{n}} N_n - zI_n)^{-1}\|$.

This will be achieved in Theorem 2.1 below. The strength of this theorem is that
it requires a very weak assumption on the distribution of the entries. All we need is
a finite second moment. Several results of this type were obtained recently, under
stronger assumptions on $x$. For example, [22] addressed the case when $x$ was real
Gaussian; [28] addressed the case when $x$ has support on the integers and $M$ has
integer entries. This was done building upon the $M = 0$ case discussed in [27]. The
case when $x$ has finite third moment and $\|M\|$ is bounded by $O(n^{1/2})$ was
addressed in [19] (building upon the $M = 0$ real-valued case proven in [21]). In this
result, the assumption on the norm of $M$ is important and the constant 1/2 (in
the exponent of $n$) cannot be replaced by any other constant. Furthermore, in the
complex-valued case, the bounds in [19] depended on the entire covariance matrix
of $x$ and not just on the variance.

Theorem 2.1 (Least singular value bound). Let $A, C_1$ be positive constants, and let
$x$ be a complex-valued random variable with non-zero finite variance (in particular,
the second moment is finite). Then there are positive constants $B$ and $C_2$ such that
the following holds: if $N_n$ is the random matrix of order $n$ whose entries are iid
copies of $x$, and $M$ is a deterministic matrix of order $n$ with spectral norm at most
$n^{C_1}$, then

$$\mathbf{P}(\|(M + N_n)^{-1}\| \ge n^B) \le C_2 n^{-A}.$$

It is very important that we can have any constant $A$ in the bound. If $A > 1$,
then the right hand side is summable in $n$ and this is critical to the strong circular
law. In order to prove the weak law, any $A > 0$ suffices. The difference in difficulty between getting
some $A > 0$ and getting $A > 1$ can be illustrated by the following simplified case. Take
$M$ to be the zero matrix and $N$ to be the random Bernoulli matrix (whose entries take
value $\pm 1$ with probability 1/2). To make the situation even simpler, assume that
we only want to bound the probability that $N^{-1}$ does not exist (namely that $N$ is
singular). Already in the 70s, Komlós [4] proved that this probability is $O(n^{-1/2})$.
However, the first proof for a bound of the type $O(n^{-1-\epsilon})$ was obtained only almost
twenty years later by Kahn, Komlós and Szemerédi [15], using a much more complex
argument.

Let us now go back to Theorem 2.1. In fact, we have a more precise statement 
involving a seemingly stronger (but actually equivalent) assumption on x. More 
precisely, we introduce the following technical definition. 

Definition 2.2 (Controlled second moment). Let $\kappa \ge 1$. A complex random
variable $x$ is said to have $\kappa$-controlled second moment if one has the upper bound

$$\mathbf{E}|x|^2 \le \kappa$$

(in particular, $|\mathbf{E}x| \le \kappa^{1/2}$), and the lower bound

(1) $$\mathbf{E}\, \mathrm{Re}(zx - w)^2\, I(|x| \le \kappa) \ge \frac{1}{\kappa} \mathrm{Re}(z)^2$$

for all complex numbers $z, w$.

Example 2.3. The Bernoulli random variable ($\mathbf{P}(x = +1) = \mathbf{P}(x = -1) = 1/2$)
has 1-controlled second moment. The condition (1) asserts in particular that $x$ has
variance at least $\frac{1}{\kappa}$, but also asserts that a significant portion of this variance occurs
inside the event $|x| \le \kappa$, and also contains some more technical phase information
about the covariance matrix of $\mathrm{Re}(x)$ and $\mathrm{Im}(x)$.



To show that this condition is not significantly stronger than bounded second mo- 
ment, we prove that any complex random variable of finite non-zero variance has 
controlled second moment after a (harmless) phase rotation: 

Lemma 2.4. Let $x$ be a complex random variable with finite non-zero variance.
Then there exists a phase $e^{\sqrt{-1}\theta}$ and a $\kappa \ge 1$ such that $e^{\sqrt{-1}\theta} x$ has $\kappa$-controlled
second moment.



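Proof. For $\kappa$ sufficiently large, we have $\mathbf{E}|x|^2 \le \kappa$, and the event $|x| \le \kappa$ has
probability at least $1/\sqrt{\kappa}$. Let $x_\kappa$ be the variable $x$ conditioned on the event
$|x| \le \kappa$. Since $x$ has non-zero variance, we see that $x_\kappa$ will also have non-zero
variance for $\kappa$ large enough. It will then suffice to show that

$$\mathbf{E}\, \mathrm{Re}(z x_\kappa - w)^2 \ge \frac{\mathrm{Re}(z)^2}{\sqrt{\kappa}}$$

after rotating $x$ by a phase if necessary. If we write $y_\kappa := x_\kappa - \mathbf{E}(x_\kappa)$, then we
easily compute

$$\mathbf{E}\, \mathrm{Re}(z x_\kappa - w)^2 = \mathbf{E}\, \mathrm{Re}(z y_\kappa + z\mathbf{E}(x_\kappa) - w)^2 = \mathbf{E}\, \mathrm{Re}(z y_\kappa)^2 + \mathrm{Re}(z\mathbf{E}(x_\kappa) - w)^2$$

so it suffices to show that for $\kappa$ sufficiently large we have

(2) $$\mathbf{E}\, \mathrm{Re}(z y_\kappa)^2 \ge \frac{\mathrm{Re}(z)^2}{\sqrt{\kappa}}.$$

Now set $y := x - \mathbf{E}(x)$ and consider the covariance matrix

$$\begin{pmatrix} \mathbf{E}\, \mathrm{Re}(y)^2 & \mathbf{E}\, \mathrm{Re}(y)\mathrm{Im}(y) \\ \mathbf{E}\, \mathrm{Re}(y)\mathrm{Im}(y) & \mathbf{E}\, \mathrm{Im}(y)^2 \end{pmatrix}.$$

Since $x$ has finite non-zero variance, we see that this matrix is finite, non-zero,
and positive semi-definite. In particular its largest eigenvalue is at least $\delta$ for some
$\delta > 0$. By monotone convergence we then conclude that the covariance matrix

(3) $$\begin{pmatrix} \mathbf{E}\, \mathrm{Re}(y_\kappa)^2 & \mathbf{E}\, \mathrm{Re}(y_\kappa)\mathrm{Im}(y_\kappa) \\ \mathbf{E}\, \mathrm{Re}(y_\kappa)\mathrm{Im}(y_\kappa) & \mathbf{E}\, \mathrm{Im}(y_\kappa)^2 \end{pmatrix}$$

has largest eigenvalue at least $\delta/2$ for $\kappa$ sufficiently large.

Now fix $\kappa$ large enough so that all the above statements hold, and also so that
$\frac{1}{\sqrt{\kappa}} \le \frac{\delta}{2}$. The null space of (3) is at most one-dimensional. By rotating $x$ by a
phase we may then assume that the null space is contained in the imaginary axis
$\{ \binom{0}{w} : w \in \mathbf{R} \}$. Since covariance matrices are positive semi-definite, we thus have
the quadratic form estimate

$$|u^2\, \mathbf{E}\, \mathrm{Re}(y_\kappa)^2 + 2uv\, \mathbf{E}\, \mathrm{Re}(y_\kappa)\mathrm{Im}(y_\kappa) + v^2\, \mathbf{E}\, \mathrm{Im}(y_\kappa)^2| \ge \frac{\delta}{2} u^2$$

and (2) follows by setting $u := \mathrm{Re}(z)$ and $v := \mathrm{Im}(z)$. □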



Since rotating all entries by the same phase does not change the norm of the inverse, 
Theorem 2.1 follows from the following theorem. 

Theorem 2.5 (Least singular value bound). Let $A, C_1, C_2$ be positive constants.
There are positive constants $B$ and $C_3$ such that the following holds. Let $x$ be a
random variable with $C_1$-controlled second moment and $N_n$ be the random matrix
of order $n$ whose entries are i.i.d. copies of $x$. Let $M$ be a deterministic matrix of
order $n$ with spectral norm at most $n^{C_2}$. Then,

$$\mathbf{P}(\|(M + N_n)^{-1}\| \ge n^B) \le C_3 n^{-A}.$$

Remark 2.6. Our arguments give an explicit dependence of $B$ in terms of $A$ and
$C_2$. One can set $B$ to be roughly $2AC_2$. A more exact dependence can be obtained
with considerably more technical details. Since for the proof of the circular law, any
constant $B$ suffices, we do not go into this matter here and will discuss it elsewhere.

Remark 2.7. Notice that the assumptions in Theorem 2.5 are weaker than the
assumption of Theorem 1.2. We do not require $x$ to have mean zero and bounded
$(2+\eta)$-th moment. In the proof of Theorem 1.2, these extra assumptions are needed
in order to repeat the approach of Bai, and are unrelated to the pole problem or
Theorem 2.5.

Remark 2.8. One can relax somewhat the hypothesis that the entries $x_{ij}$ of $N_n$ are
i.i.d. copies of $x$. It is sufficient to assume the following:

• The $x_{ij}$ are dominated by a single distribution $x$ in the Fourier-analytic sense
that $|\mathbf{E}(e^{2\pi\sqrt{-1}\,\mathrm{Re}(\xi x_{ij})})| \le \mathbf{E}(e^{2\pi\sqrt{-1}\,\mathrm{Re}(\xi x)})$ for all complex numbers $\xi$.
• $x$ has $\kappa$-controlled second moment for some fixed $\kappa$.

This refinement can be extracted without too much difficulty from the proof in this
paper, which ultimately relies on Fourier-analytic methods. Using this refinement
and following [2, Chapter 10.8.2], we can extend Theorem 1.2 to the case where
the entries of $N_n$ are independent, but not necessarily identically distributed, as
mentioned in the introduction.

In order to deal with sparse random matrices, we prove the following variant of 
Theorem 2.1. 

Theorem 2.9 (Least singular value for sparse matrices). Let $A, C_1, C_2, \alpha$ be
positive constants. There are positive constants $B$ and $C_3$ depending on $A, C_1, C_2, \alpha$
such that the following holds. Let $x$ be a random variable with $C_1$-controlled second
moment and let $N_{n,p}$ be the random matrix of order $n$ defined as in Theorem 1.3.
Let $M$ be a deterministic matrix of order $n$ with spectral norm at most $n^{C_2}$. Then,

$$\mathbf{P}(\|(M + N_{n,p})^{-1}\| \ge n^B) \le C_3 n^{-A}.$$



To conclude this section, let us derive a simple corollary of Theorem 2.9.

Corollary 2.10 (Condition number bound). Let $A, C_1, C_2, \alpha$ be positive constants.
There are positive constants $B$ and $C_3$ such that the following holds. Let $x$ be a
random variable with $C_1$-controlled second moment and $N_{n,p}$ be the random matrix
of order $n$ defined as in Theorem 1.3. Let $M$ be a deterministic matrix of order $n$
with spectral norm at most $n^{C_2}$. Then,

$$\mathbf{P}(\|M + N_{n,p}\|\, \|(M + N_{n,p})^{-1}\| \ge n^B) \le C_3 n^{-A}.$$

Proof. A simple application of Chebyshev's inequality shows that

$$\mathbf{P}(|x| \ge n^{A/2+1}) \le C_1 n^{-A-2}.$$

Since $\|N_{n,p}\|$ is bounded from above by $n \max_{i,j} |x_{ij}|$, we have that

$$\mathbf{P}(\|N_{n,p}\| \ge n^{A/2+2}) \le C_1 n^{-A}$$

by the union bound. Combining this with the polynomial bound on $\|M\|$ and with
Theorem 2.9, the claim follows by choosing $B$ sufficiently large. □

The condition number $\|M\|\|M^{-1}\|$ of a matrix $M$ plays a crucial role in
numerical linear algebra (see [3], for instance). The above corollary implies that if one
perturbs a fixed matrix $M$ by a (very general) sparse random matrix $N_{n,p}$, the
condition number of the resulting matrix will be relatively small with high probability.
This fact has some nice applications in theoretical computer science (see [28] or
[23], for example).

3. Inverse Littlewood-Offord theorems 

Let us consider a toy case in order to illustrate the ideas behind the proof of
Theorem 2.5. Assume, for a moment, that $M = 0$ and $x = N(0,1)$ is real Gaussian.
In this case, we talk about the least singular value of the random matrix $N_n$ whose
entries are i.i.d. real Gaussian. Let $X_i$ be the row vectors of $N_n$ and $d_i$ be the
distance from $X_i$ to the hyperplane $H_i$ spanned by $X_j$, $j \ne i$. The least singular
value of $N_n$ is close (up to factors of $n^{O(1)}$) to $\min_{1 \le i \le n} d_i$. Thus, our goal is to
prove that with high probability, each of the $d_i$ is bounded away from 0.

In this Gaussian case, the task is simple since, thanks to symmetry, the distribution
of $d_i$ does not depend on the vectors $X_j$, $j \ne i$. Indeed, $d_i$ has the same distribution
as the distance from a Gaussian vector to a fixed hyperplane. This variable is well 
understood and satisfies the inequality 

for any fixed positive constant $A$. This leads to the conclusion of Theorem 2.5 in
this simple case.

However, the general case is much more difficult. For example, if the entries of
$N_n$ are iid Bernoulli, it is already non-trivial to prove that $N_n$ is asymptotically almost
surely non-singular (i.e. that with probability $1 - o(1)$, one has $d_i \ne 0$ for all $i$).
The point here is that one can no longer fix $X_j$, $j \ne i$. As a matter of fact, the
distribution of the distance $d_i$ depends heavily on the position of the hyperplane
$H_i$ spanned by the $X_j$, $j \ne i$. For example, let $x$ be Bernoulli and consider the
following two situations:

• $H_i$ has normal vector $(\frac{1}{\sqrt{n}}, \ldots, \frac{1}{\sqrt{n}})$. In this case, $d_i = 0$ with probability
$\Theta(n^{-1/2})$ (for even $n$).

• $H_i$ has normal vector $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0, \ldots, 0)$. In this case, $d_i = 0$ with
probability $\frac{1}{2}$.
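
The two situations above can be verified by exact computation. The sketch below is an illustration only (the function names are hypothetical): since the event $d_i = 0$ is unchanged when the normal vector is rescaled, it works with integer weights, and it assumes the second normal vector is supported on two coordinates of equal magnitude.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob_zero_brute_force(weights):
    """P(w_1 x_1 + ... + w_k x_k = 0) for iid uniform signs x_i, by enumerating all sign patterns."""
    patterns = list(product((-1, 1), repeat=len(weights)))
    hits = sum(1 for signs in patterns
               if sum(w * s for w, s in zip(weights, signs)) == 0)
    return Fraction(hits, len(patterns))


def prob_zero_all_ones(n):
    """Closed form for weights (1, ..., 1): exactly n/2 of the signs must equal +1."""
    if n % 2 == 1:
        return Fraction(0)
    return Fraction(comb(n, n // 2), 2 ** n)


if __name__ == "__main__":
    # Normal vector supported on two coordinates: probability exactly 1/2.
    print(prob_zero_brute_force([1, 1, 0, 0]))
    # Flat normal vector: probability of order n^{-1/2} for even n.
    for n in (4, 16, 64, 256):
        p = prob_zero_all_ones(n)
        print(n, float(p), float(p) * n ** 0.5)
```

For the flat normal vector the rescaled quantity `float(p) * n ** 0.5` stays bounded, while the two-coordinate vector gives a probability that does not decay at all; this is the sense in which the second hyperplane is much worse than the first.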

A hyperplane $H$ is, in some sense, bad for us if the distance from a random (row)
vector to $H$ is small with non-negligible probability. It is important to understand
the bad hyperplanes. Notice that if $v = (v_1, \ldots, v_n)$ is the unit normal vector of
$H$, then the distance in question is exactly the random variable

$$|v_1 x_1 + \ldots + v_n x_n|,$$

where the $x_i$ are i.i.d. copies of $x$.

This naturally leads to introducing the following concept. 

Definition 3.1 (Small ball probability). Let $x$ be a complex random variable, and
let $v = (v_1, \ldots, v_n)$ be a tuple of complex numbers. We define the random walk
$W_x(v)$ to be the complex random variable

(4) $$W_x(v) := v_1 x_1 + \ldots + v_n x_n$$

where $x_1, \ldots, x_n \equiv x$ are iid copies of $x$. For any $z \in \mathbf{C}$ and $r > 0$, we let $B(z, r)$
denote the closed disk of radius $r$ centered at $z$. For any $r > 0$, we define the small
ball probability

$$p_{r,x}(v) := \sup_{z \in \mathbf{C}} \mathbf{P}(W_x(v) \in B(z, r)).$$

Intuitively, we expect the small ball probability $p_{r,x}(v)$ to be quite small for "most"
tuples $v$. The question, of course, is to quantify "most".

A classical theorem of Littlewood and Offord [17] (see also [6]) shows that if $x$ is
Bernoulli, and all $|v_i| \ge 1$, then $p_{1,x}(v) = O(n^{-1/2})$. There are several extensions
of this result. They typically improve upon the bound $O(n^{-1/2})$, under extra
assumptions on the $v_i$. We are going to refer to results in this spirit as forward
Littlewood-Offord theorems.
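
To make the $O(n^{-1/2})$ bound concrete, here is a small exact computation. It is a sketch under simplifying assumptions that are not in the text: the weights $v_i$ are integers, $x$ is the Bernoulli sign, and $r$ is an integer. Under these assumptions the walk is integer-valued, so the supremum over disks reduces to a maximum over windows $[a, a+2r]$ with integer $a$. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def walk_distribution(v):
    """Exact law of v_1 x_1 + ... + v_n x_n for iid uniform signs x_i and integer weights v_i."""
    dist = Counter({0: Fraction(1)})
    for weight in v:
        nxt = Counter()
        for value, prob in dist.items():
            nxt[value + weight] += prob / 2
            nxt[value - weight] += prob / 2
        dist = nxt
    return dist


def small_ball_probability(v, r=1):
    """Maximum over integer a of P(a <= W <= a + 2r), which equals p_{r,x}(v) for an integer-valued walk."""
    dist = walk_distribution(v)
    lo, hi = min(dist), max(dist)
    best = Fraction(0)
    for a in range(lo - 2 * r, hi + 1):
        mass = sum(dist.get(k, 0) for k in range(a, a + 2 * r + 1))
        best = max(best, mass)
    return best


if __name__ == "__main__":
    for n in (16, 64, 256):
        p = small_ball_probability([1] * n)
        print(n, float(p), float(p) * n ** 0.5)
```

With all weights equal to 1 the printed product `float(p) * n ** 0.5` stays bounded as $n$ grows, in line with the classical bound.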

For our purposes, we need inverse Littlewood-Offord theorems. Such a theorem is
supposed to give a characterization of those vectors $v$ for which $p_{r,x}(v)$ is larger than some
lower bound. The study of inverse Littlewood-Offord theorems was started in [27],
where we investigated the case when $x$ has discrete support. A new result in this
spirit was recently obtained in [21], where the authors investigated sub-Gaussian
distributions, as well as distributions with bounded third or fourth moments.

In the current situation, we only assume that $x$ has $O(1)$-controlled second moment.
The weakness of this assumption is a major obstacle and makes the proof much more
complicated. It is still possible to obtain a reasonably strong characterization of
$v$, given that $p_{r,x}(v)$ is large. However, this characterization is somewhat technical
to state and so we will only explicitly state here a corollary of it, which will be
sufficient for our purpose of proving the least singular value bound and the circular
law.

Let $x$ be a complex random variable. Let $n$ be a positive integer and $\beta, p$ be
positive numbers that may depend on $n$. Let $S_{n,x,\beta,p}$ be the set of all unit vectors
$v = (v_1, \ldots, v_n) \in \mathbf{C}^n$ such that one has the concentration bound

$$p_{\beta,x}(v) \ge p.$$

We give $\mathbf{C}^n$ the norm

$$\|(v_1, \ldots, v_n)\|_\infty := \sup_{1 \le i \le n} |v_i|.$$

Theorem 3.2 (Inverse Littlewood-Offord theorem). Let $x$ be a complex random
variable which has $\kappa$-controlled second moment for some $\kappa > 0$. Let $0 < \epsilon < 1$.
Then, for all $n$ which are sufficiently large depending on $\kappa, \epsilon$, and all $\beta \ge \exp(-n^{\epsilon/2})$
and $p \ge n^{-O(1)}$, there is a set $S' \subset \mathbf{C}^n$ of size at most $n^{(-1/2+\epsilon)n} p^{-n} + \exp(o(n))$
such that for any $v \in S_{n,x,\beta,p}$ there is $v' \in S'$ such that $\|v - v'\|_\infty \le \beta$. In other
words, $S_{n,x,\beta,p}$ has a maximal $\beta$-net in the $l^\infty$ norm of size at most $n^{(-1/2+\epsilon)n} p^{-n} + \exp(o(n))$.

Theorem 3.3 (Inverse Littlewood-Offord theorem for sparse random variables).
Let $x$ be a complex random variable which has $\kappa$-controlled second moment for some
$\kappa > 0$. Let $0 < \epsilon < 1$. Then, for all $n$ which are sufficiently large depending on
$\kappa, \epsilon$, all $\beta \ge \exp(-n^{\epsilon/2})$ and $p \ge n^{-O(1)}$, all $1/n \le \mu \le 1$, and all $m$ between $n^\epsilon$
and $n^{1-\epsilon}\mu$, there is a set $S' \subset \mathbf{C}^n$ of size at most $n^{O(\epsilon)n} (p\sqrt{m})^{-n} + \exp(n^{O(\epsilon)} m/\mu)$
such that for any $v \in S_{n,xI_\mu,\beta,p}$ there is $v' \in S'$ such that $\|v - v'\|_\infty \le \beta$.
In other words, $S_{n,xI_\mu,\beta,p}$ has a maximal $\beta$-net in the $l^\infty$ norm of size at most
$n^{O(\epsilon)n} (p\sqrt{m})^{-n} + \exp(n^{O(\epsilon)} m/\mu)$.

Remark 3.4. If one sets $m = n^{1-C\epsilon}\mu$ for some absolute constant $C$, then the
conclusion of Theorem 3.3 is similar to that in Theorem 3.2 except for the extra term $\sqrt{\mu}$
in Theorem 3.3. However, for our applications it will be slightly more convenient
to choose $m$ at the other extreme, thus $m = n^\epsilon$. The main point here is that the
size of $S_{n,x,\beta,p}$ (or $S_{n,xI_\mu,\beta,p}$) tends to be much smaller than

Definition 3.5 (Entropy). Let $A$ be a precompact subset of a metric space $X$, and
let $\epsilon > 0$. We define the internal metric entropy $\mathcal{N}_\epsilon(A)$ to be the cardinality of the
largest $\epsilon$-net in $A$ (i.e. a set $B \subset A$ where any two elements in $B$ are separated by
distance at least $\epsilon$). We define the external metric entropy $\mathcal{N}'_\epsilon(A)$ to be the least number
of closed $\epsilon$-balls in $X$ needed to cover $A$.

One easily verifies that

$$\mathcal{N}_{2\epsilon}(A) \le \mathcal{N}'_\epsilon(A) \le \mathcal{N}_\epsilon(A),$$

and furthermore in the complex plane $X = \mathbf{C}$ we have $\mathcal{N}_{2\epsilon}(A) = \Omega(\mathcal{N}_\epsilon(A))$. As
constant factors will not play any important role, the two notions of entropy will
be essentially equivalent for our purposes.
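
The second inequality above rests on a simple observation: a set of $\epsilon$-separated points that cannot be enlarged automatically gives a cover by closed $\epsilon$-balls. The following minimal sketch illustrates this on a finite sample of the plane. The function names, the sample, and the greedy construction are illustrative assumptions; the greedy set is maximal under inclusion but not necessarily the largest possible, so its size lies between the external and the internal entropy of the sample.

```python
import random


def greedy_separated_subset(points, eps):
    """Greedily keep points at pairwise distance >= eps; the result cannot be enlarged within `points`."""
    chosen = []
    for p in points:
        if all(abs(p - q) >= eps for q in chosen):
            chosen.append(p)
    return chosen


def is_covered(points, centres, eps):
    """Check that every point lies in a closed eps-ball around some centre."""
    return all(any(abs(p - c) <= eps for c in centres) for p in points)


# A reproducible sample of 400 points of the square [-1, 1]^2, stored as complex numbers.
rng = random.Random(0)
sample = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(400)]
net = greedy_separated_subset(sample, 0.25)

if __name__ == "__main__":
    print(len(net), is_covered(sample, net, 0.25))
```

Every rejected point lies within distance less than $\epsilon$ of a chosen one, which is exactly why the chosen points serve as centres of a cover.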

Since $\|v\|_\infty \ge n^{-1/2} \|v\|_2$, we have the following corollary.

Corollary 3.6. Let $x$ be a complex random variable which has $\kappa$-controlled second
moment, for some constant $\kappa > 0$. Let $\epsilon$ be an arbitrary positive constant. Then
for any positive numbers $\mu, \beta, p \le 1$ and all sufficiently large $n$ we have

$$\mathcal{N}_{\beta n^{1/2}}(S_{n,x,\beta,p}) \le n^{(-1/2+\epsilon)n} p^{-n} + \exp(o(n))$$

and

$$\mathcal{N}_{\beta n^{1/2}}(S_{n,xI_\mu,\beta,p}) \le n^{(-1/2+\epsilon)n} (p\sqrt{\mu})^{-n} + \exp(o(n)).$$

Remark 3.7. In fact, the proof of Theorem 3.2 gives a fairly precise description
of the set $S_{n,x,\beta,p}$, as is the case with other inverse Littlewood-Offord theorems
in the literature. However, this description is somewhat technical to state and we
only need the entropy bound on $S_{n,x,\beta,p}$ in our application, so we have presented
Theorem 3.2 in the above short (but less explicit) form.



4. Concentration probabilities and Fourier analysis 

Throughout this section $x$ will be a fixed complex random variable with
$O(1)$-controlled second moment. For any $0 < \mu \le 1$, let $x^{(\mu)}$ be the random variable

(5) $$x^{(\mu)} := (x_1 - x_2) I_{\mu/2}$$

where $x_1, x_2$ are iid copies of $x$ and $I_{\mu/2}$ is independent from $x_1, x_2$.

Example 4.1. If $x$ is the Bernoulli random variable $\mathbf{P}(x = +1) = \mathbf{P}(x = -1) = 1/2$,
then $x^{(\mu)} \in \{0, +2, -2\}$ with $\mathbf{P}(x^{(\mu)} = +2) = \mathbf{P}(x^{(\mu)} = -2) = \mu/8$.
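
The value $\mu/8$ in Example 4.1 can be recovered by direct enumeration. The sketch below assumes that the selector in (5) is a Bernoulli variable with success probability $\mu/2$, independent of $x_1, x_2$; this normalisation is an assumption made here because it reproduces $\mu/8$. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def symmetrised_law(mu):
    """Exact law of (x1 - x2) * I for iid uniform signs x1, x2 and an independent Bernoulli(mu/2) selector I."""
    selector = {1: mu / 2, 0: 1 - mu / 2}
    law = {}
    for x1, x2 in product((-1, 1), repeat=2):
        for indicator, p_indicator in selector.items():
            value = (x1 - x2) * indicator
            law[value] = law.get(value, Fraction(0)) + Fraction(1, 4) * p_indicator
    return law


if __name__ == "__main__":
    mu = Fraction(1, 3)
    for value, prob in sorted(symmetrised_law(mu).items()):
        print(value, prob)
```

For $\mu = 1/3$ the masses at $\pm 2$ are each $1/24 = \mu/8$, and the remaining mass $1 - \mu/4$ sits at 0.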

For any $0 < \mu \le 1$ and any tuple $v = (v_1, \ldots, v_n)$ of complex numbers, define the
concentration probability

(6) $$P_\mu(v) := \mathbf{E} \exp(-\pi |W_{x^{(\mu)}}(v)|^2).$$

This quantity will turn out to be very convenient for controlling the small ball
probabilities of $W_x(v)$ (see Lemma 4.3 below). To do that, we first need a
Fourier-analytic representation of $P_\mu(v)$. We introduce the characteristic function
$f : \mathbf{C} \to \mathbf{R}$, defined by

(7) $$f(\xi) := |\mathbf{E}(e(\mathrm{Re}(\xi x)))|^2$$

where $e$ is the standard character

$$e(t) := e^{2\pi\sqrt{-1}\, t}.$$

Lemma 4.2 (Fourier representation). For any tuple $v = (v_1, \ldots, v_n)$ of complex
numbers and any $0 < \mu \le 1$, we have

(8) $$P_\mu(v) = \int_{\mathbf{C}} \prod_{i=1}^n \left(1 - \frac{\mu}{2} + \frac{\mu}{2} f(\xi v_i)\right) \exp(-\pi |\xi|^2)\, d\xi.$$

Here of course $d\xi$ is Lebesgue measure on the complex plane $\mathbf{C}$.

Proof. From the Fourier identity

(9) $$\exp(-\pi |z|^2) = \int_{\mathbf{C}} e(\mathrm{Re}(\xi z)) \exp(-\pi |\xi|^2)\, d\xi$$

and (6) we have

(10) $$P_\mu(v) = \int_{\mathbf{C}} \mathbf{E}\, e(\mathrm{Re}(\xi W_{x^{(\mu)}}(v))) \exp(-\pi |\xi|^2)\, d\xi.$$

On the other hand, from (4), (5), (7) and independence we see that

$$\mathbf{E}\, e(\mathrm{Re}(\xi W_{x^{(\mu)}}(v))) = \prod_{i=1}^n \left(1 - \frac{\mu}{2} + \frac{\mu}{2} f(\xi v_i)\right)$$

and the claim follows. □

The relevance of concentration probability to the small ball probability is provided
by the following lemma:

Lemma 4.3 (Concentration probability bounds small ball probability). For any
tuple $v$ and any $r > 0$, we have

$\mathbf{P}_{r,x}(v) \le e^{\pi r^2}\, \mathbf{P}_1(v).$

In applications, r will be very close to and so the term e^^ can be ignored. 

Proof. From Definition 3.1, it suffices to show that

$\mathbf{P}(W_x(v) \in B(z,r)) \le e^{\pi r^2}\, \mathbf{P}_1(v)$

for any $z \in \mathbb{C}$. Notice that

$\mathbf{P}(W_x(v) \in B(z,r)) \le e^{\pi r^2}\, \mathbf{E} \exp(-\pi|W_x(v) - z|^2).$

Applying (9) as in the proof of the preceding lemma, we have

$\mathbf{E} \exp(-\pi|W_x(v) - z|^2) = \int_{\mathbb{C}} \mathbf{E}\, e(\mathrm{Re}(\xi W_x(v)))\, e(-\mathrm{Re}(\xi z)) \exp(-\pi|\xi|^2)\, d\xi.$

The quantity $|\mathbf{E}\, e(\mathrm{Re}(\xi W_x(v)))|$ can be expanded, using (4) and (7), as $\prod_{i=1}^{n} f(\xi v_i)^{1/2}$.
Since $f(\xi v_i)^{1/2} \le \frac{1}{2} + \frac{1}{2} f(\xi v_i)$, it follows that

$|\mathbf{E}\, e(\mathrm{Re}(\xi W_x(v)))| \le \prod_{i=1}^{n} \left(\frac{1}{2} + \frac{1}{2} f(\xi v_i)\right).$

The claim of the lemma follows from the triangle inequality and Lemma 4.2. □

We now generalize the above lemma to the sparse case:

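Lemma 4.4 (Concentration probability bounds small ball probability, sparse version).
For any tuple $v$ and any $r > 0$, $1 \ge \mu > 0$, we have

$\mathbf{P}_{r, xI_\mu}(v) \le e^{\pi r^2}\, \mathbf{P}_\mu(v).$

Proof. The proof is almost identical to the previous one. The only difference
here is that we have $|\mathbf{E}\, e(\mathrm{Re}(\xi W_{xI_\mu}(v)))|$ instead of $|\mathbf{E}\, e(\mathrm{Re}(\xi W_x(v)))|$. Notice
that $|\mathbf{E}\, e(\mathrm{Re}(\xi W_{xI_\mu}(v)))|$ can be bounded by $\prod_{i=1}^{n} ((1 - \mu) + \mu f(\xi v_i)^{1/2})$. Since
$f(\xi v_i)^{1/2} \le \frac{1}{2} + \frac{1}{2} f(\xi v_i)$, it follows that

$|\mathbf{E}\, e(\mathrm{Re}(\xi W_{xI_\mu}(v)))| \le \prod_{i=1}^{n} \left(\left(1 - \frac{\mu}{2}\right) + \frac{\mu}{2} f(\xi v_i)\right),$

and again we are done using Lemma 4.2. □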

The concentration probability has several pleasant properties (cf. [27, Lemma 5.1]): 

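Lemma 4.5 (Properties of $\mathbf{P}_\mu$). Let $0 < \mu \le 1$. Then the following properties
hold.

(i) The quantity $\mathbf{P}_\mu(w)$ is monotone decreasing in $\mu$ and permutation invariant
in $w$.

(ii) For any tuples $v, w$ we have

$\mathbf{P}_\mu(vw) \le \mathbf{P}_\mu(v)$

where $vw$ is the concatenation of $v$ and $w$.

(iii) For any tuples $v, w$ we have

$\mathbf{P}_\mu(v)\mathbf{P}_\mu(w) \le 2\mathbf{P}_\mu(vw).$

(iv) For any $k \ge 1$ and tuple $v$ we have

$\mathbf{P}_\mu(v) \le \mathbf{P}_{\mu/k}(v^k)$

where $v^k$ is the concatenation of $k$ copies of $v$.

(v) For any tuples $v, w_1, \ldots, w_m$ we have

$\mathbf{P}_\mu(v w_1 \ldots w_m) \le \left(\prod_{i=1}^{m} \mathbf{P}_\mu(v w_i^m)\right)^{1/m}.$

In particular, by the pigeonhole principle, there exists $1 \le i \le m$ such that
$\mathbf{P}_\mu(v w_1 \ldots w_m) \le \mathbf{P}_\mu(v w_i^m)$.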

Proof. Properties (i), (ii) are immediate from (8). To prove (iii), observe from (6)
that

$\mathbf{P}_\mu(v)\mathbf{P}_\mu(w) = \mathbf{E} \exp(-\pi(|W_{x^{(\mu)}}(v)|^2 + |W_{x^{(\mu)}}(w)|^2))$

where we require the walks $W_{x^{(\mu)}}(v)$, $W_{x^{(\mu)}}(w)$ to be independent. Using the
arithmetic mean-geometric mean inequality

$|W_{x^{(\mu)}}(v)|^2 + |W_{x^{(\mu)}}(w)|^2 \ge \frac{1}{2} |W_{x^{(\mu)}}(vw)|^2$

followed by the Fourier identity

$\exp(-\pi|z|^2/2) = 2 \int_{\mathbb{C}} e(\mathrm{Re}(\xi z)) \exp(-2\pi|\xi|^2)\, d\xi$

we conclude that

$\mathbf{P}_\mu(v)\mathbf{P}_\mu(w) \le 2 \int_{\mathbb{C}} \mathbf{E}\, e(\mathrm{Re}(\xi W_{x^{(\mu)}}(vw))) \exp(-2\pi|\xi|^2)\, d\xi.$

Comparing this with (10) we obtain the claim.

The inequality (iv) follows easily from (8) and the elementary inequality $1 - t \le (1 - t/k)^k$
for all $0 \le t \le 1$, which follows from the concavity of $\log(1 - t)$. Finally,
the inequality (v) follows from (8) and Hölder's inequality. □
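
The elementary inequality behind (iv) can be unpacked in one line. The following is a sketch using Jensen's inequality for the concave function $s \mapsto \log s$, applied to the points $1 - t$ and $1$ with weights $1/k$ and $1 - 1/k$. The case $t = 1$ is trivial, since the left-hand side of the final inequality vanishes.

```latex
% For 0 <= t < 1 and k >= 1:
\log\Bigl(1 - \frac{t}{k}\Bigr)
  = \log\Bigl(\frac{1}{k}(1 - t) + \Bigl(1 - \frac{1}{k}\Bigr)\cdot 1\Bigr)
  \ge \frac{1}{k}\log(1 - t) + \Bigl(1 - \frac{1}{k}\Bigr)\log 1
  = \frac{1}{k}\log(1 - t),
% and multiplying by k and exponentiating gives
1 - t \le \Bigl(1 - \frac{t}{k}\Bigr)^{k}.
```

Writing each factor of (8) as $1 - \frac{\mu}{2} + \frac{\mu}{2} f(\xi v_i) = 1 - t$ with $t = \frac{\mu}{2}(1 - f(\xi v_i)) \in [0,1]$, the right-hand side $(1 - t/k)^k$ is the product of the $k$ factors that the repeated entry $v_i$ contributes to the integrand for $\mathbf{P}_{\mu/k}(v^k)$.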



5. The x-norm of a complex number 



In this section, we present a way to estimate the characteristic function $f$ (and hence
the concentration probabilities $\mathbf{P}_\mu(w)$) in terms of a more convenient expression.
Define the $x$-norm of a complex number $w \in \mathbb{C}$ by the formula

(11) $\|w\|_x := \left(\mathbf{E} \|\mathrm{Re}(w(x_1 - x_2))\|_{\mathbb{R}/\mathbb{Z}}^2\right)^{1/2}$

where $x_1, x_2$ are iid copies of $x$, and $\|t\|_{\mathbb{R}/\mathbb{Z}}$ denotes the distance from $t$ to the
nearest integer.

Example 5.1. If $x$ is Bernoulli, then $\|w\|_x = \frac{1}{\sqrt{2}} \|\mathrm{Re}(2w)\|_{\mathbb{R}/\mathbb{Z}}$. So in this case the
$x$-norm of $w$ is basically the size of the fractional part of $\mathrm{Re}(2w)$.
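
Example 5.1 can be verified numerically from definition (11). The sketch below enumerates the four equally likely sign patterns of $(x_1, x_2)$ for Bernoulli $x$; the helper names are ours and not part of the paper.

```python
import math
from itertools import product


def dist_to_nearest_integer(t):
    """The quantity ||t||_{R/Z}: distance from the real number t to the nearest integer."""
    return abs(t - round(t))


def x_norm_bernoulli(w):
    """x-norm (11) of a complex number w when x is the Bernoulli variable +-1."""
    total = 0.0
    for x1, x2 in product((1, -1), repeat=2):
        # Each pair (x1, x2) has probability 1/4.
        total += 0.25 * dist_to_nearest_integer((w * (x1 - x2)).real) ** 2
    return math.sqrt(total)


if __name__ == "__main__":
    w = 0.3 + 0.7j
    closed_form = dist_to_nearest_integer((2 * w).real) / math.sqrt(2)
    print(x_norm_bernoulli(w), closed_form)
```

Only the pairs with $x_1 \ne x_2$ contribute, which is where the factor $1/\sqrt{2}$ comes from.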

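Lemma 5.2 (Relationship between $f$ and $x$-norm). For any $w \in \mathbb{C}$ and $0 < \mu \le 1$

$\left(1 - \frac{\mu}{2}\right) + \frac{\mu}{2} f(w) \le \exp(-\Omega(\mu \|w\|_x^2))$

and thus by Lemma 4.2 we have

(12) $\mathbf{P}_\mu(w) \le \int_{\mathbb{C}} \exp\left(-\Omega\left(\mu \sum_{i=1}^{k} \|\xi w_i\|_x^2\right)\right) \exp(-\pi|\xi|^2)\, d\xi$

for any tuple $w = (w_1, \ldots, w_k)$.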



Proof. In view of the elementary inequality $1 - t \le \exp(-t)$ for $t \ge 0$, it will suffice
to show that

$f(w) \le 1 - \Omega(\|w\|_x^2).$

But from (7) we have the identity

$f(w) = \mathrm{Re}\, \mathbf{E}\, e(\mathrm{Re}(w(x_1 - x_2))) = \mathbf{E} \cos(2\pi \mathrm{Re}(w(x_1 - x_2)))$

and the claim follows from the elementary inequality $\cos(2\pi\theta) \le 1 - \Omega(\|\theta\|_{\mathbb{R}/\mathbb{Z}}^2)$. □
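
For concreteness, the elementary cosine inequality can be made explicit. The following sketch gives one admissible constant (namely 8, not claimed to be the one the authors had in mind), using only the double-angle formula and the bound $\sin(\pi s) \ge 2s$ for $0 \le s \le 1/2$, which holds because sine is concave on $[0, \pi/2]$.

```latex
% Let s = \|\theta\|_{\mathbb{R}/\mathbb{Z}} \in [0, 1/2]. By periodicity and evenness of cosine,
1 - \cos(2\pi\theta) = 1 - \cos(2\pi s) = 2\sin^2(\pi s) \ge 2\,(2s)^2 = 8 s^2,
% hence
\cos(2\pi\theta) \le 1 - 8\,\|\theta\|_{\mathbb{R}/\mathbb{Z}}^2 .
```

Taking expectations with $\theta = \mathrm{Re}(w(x_1 - x_2))$ then gives $f(w) \le 1 - 8\|w\|_x^2$ by definition (11).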

We now record some useful properties of the $x$-norm, which may help explain why
we call it a "norm":

Lemma 5.3 (Properties of $x$-norm). The following properties hold:

(i) For any $w \in \mathbb{C}$, $0 \le \|w\|_x \le 1$ and $\|-w\|_x = \|w\|_x$.

(ii) For any $z, w \in \mathbb{C}$, $\|z + w\|_x \le \|z\|_x + \|w\|_x$.

(iii) If $x$ has $\kappa$-controlled second moment for some positive constant $\kappa$, then
there exists a positive constant $c$ depending on $\kappa$ such that $\|z\|_x \gg |\mathrm{Re}(z)|$
for all $z \in B(0,c)$.

Proof. Property (i) is obvious. Property (ii) follows from the triangle inequality for
$L^2$ and the elementary observation that $\|x + y\|_{\mathbb{R}/\mathbb{Z}} \le \|x\|_{\mathbb{R}/\mathbb{Z}} + \|y\|_{\mathbb{R}/\mathbb{Z}}$.

Now we prove (iii). Let $z \in B(0,c)$ for some small $c$. From (11) it suffices to show
that

$\mathbf{E} \|\mathrm{Re}(z(x_1 - x_2))\|_{\mathbb{R}/\mathbb{Z}}^2 \gg |\mathrm{Re}(z)|^2.$

On the other hand, from (1) we have

$\mathbf{E} |\mathrm{Re}(x)|^2 I(|x| \le K) \gg 1$

for some $K = O(1)$. In particular $\mathbf{P}(|x| \le K) \gg 1$. So if we let $y_i$ for $i = 1,2$ be $x_i$
conditioned on the event $|x_i| \le K$, it suffices to show that

$\mathbf{E} \|\mathrm{Re}(z(y_1 - y_2))\|_{\mathbb{R}/\mathbb{Z}}^2 \gg |\mathrm{Re}(z)|^2.$

If $c$ is small enough depending on $K$, then $|z(y_1 - y_2)| \le 1/2$, so it suffices to show
that

$\mathbf{E} |\mathrm{Re}(z(y_1 - y_2))|^2 \gg |\mathrm{Re}(z)|^2.$

But this follows by conditioning on $y_2$ and then using (1). □

6. Generalized arithmetic progressions and the forward 
Littlewood-Offord theorem 

As in previous literature, our Littlewood-Offord theorems shall involve generalized 
arithmetic progressions (GAPs), which we now define. 

Definition 6.1 (Generalized arithmetic progression). If $v_1, \ldots, v_r$ are complex
numbers and $L_1, \ldots, L_r$ are positive numbers, we define the symmetric generalized
arithmetic progression (or symmetric GAP for short)

$Q = Q((v_1, \ldots, v_r), (L_1, \ldots, L_r))$

to be the set

$Q := \{n_1 v_1 + \ldots + n_r v_r \mid n_1, \ldots, n_r \in \mathbb{Z};\ |n_i| \le L_i \text{ for all } i\} \subset \mathbb{C}.$

We refer to $r$ as the rank of $Q$, $v_1, \ldots, v_r$ as the generators, and $L_1, \ldots, L_r$ as the
dimensions.

If all the sums $n_1 v_1 + \ldots + n_r v_r$ are distinct, we say that $Q$ is proper. For $t > 0$,
we define the dilate $tQ$ of $Q$ as

$tQ := Q((v_1, \ldots, v_r), (tL_1, \ldots, tL_r)).$

Finally, if $L_1 = \ldots = L_r = L$, we abbreviate $Q((v_1, \ldots, v_r), (L, \ldots, L))$ as $Q((v_1, \ldots, v_r), L)$.



GAPs are a fundamental object in additive combinatorics and they have played a 
crucial role in our earlier papers on Inverse Littlewood-Offord theorems and least 
singular values [27, 28]. For a detailed discussion about these objects, we refer to 
[25]. 

Remark 6.2. It is helpful to view $Q$ as the image of the integral box

$\{(n_1, \ldots, n_r) \mid |n_i| \le L_i,\ 1 \le i \le r\} \subset \mathbb{Z}^r$

under the linear map $\Phi$ that sends the point $(n_1, \ldots, n_r)$ to $n_1 v_1 + \cdots + n_r v_r$. $Q$ is
proper if $\Phi$ is one-to-one.
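
Definition 6.1 and Remark 6.2 translate directly into a small enumeration. The sketch below lists the image of the integral box under the map sending $(n_1, \ldots, n_r)$ to $n_1 v_1 + \cdots + n_r v_r$ and reports whether that map is one-to-one. The function name `gap_elements` is ours, and the example restricts to integer or Gaussian-integer generators so that the arithmetic is exact.

```python
import math
from itertools import product


def gap_elements(generators, dims):
    """Return (set of points of the symmetric GAP, whether the GAP is proper).

    generators: the numbers v_1, ..., v_r (ints or complex with integer parts)
    dims: the positive numbers L_1, ..., L_r; coefficients satisfy |n_i| <= L_i
    """
    ranges = [range(-math.floor(L), math.floor(L) + 1) for L in dims]
    sums = [
        sum(n * v for n, v in zip(coeffs, generators))
        for coeffs in product(*ranges)
    ]
    points = set(sums)
    # Proper means every coefficient vector gives a different sum.
    return points, len(points) == len(sums)


if __name__ == "__main__":
    print(gap_elements((1, 10), (2, 2))[1])   # generators far apart
    print(gap_elements((1, 2), (2, 2))[1])    # many coincidences among the sums
```

In the second example the 25 coefficient vectors produce only the 13 integers from $-6$ to $6$, so that GAP is not proper.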

We use the following two simple lemmas frequently: 

Lemma 6.3 (Doubling property). Let $Q$ be a symmetric GAP of rank $r$ and $t \ge 1$.
Then

$\#(tQ) \ll_r t^r\, \#Q.$

Proof. One can cover $tQ$ by $O(t^r)$ translates of $Q$. □

Lemma 6.4 (Pigeonhole principle). Let $Q \subset \mathbb{C}$ be a finite set, and let $\Omega \subset \mathbb{C}$ be a
set which can be covered by at most $M$ balls of radius $r/2$. Then we have

$\#((Q - Q) \cap B(0,r)) \ge \frac{\#(Q \cap \Omega)}{M}.$

Proof. We can of course assume that $Q \cap \Omega$ is non-empty. By the pigeonhole
principle, we can find a ball $B(z, r/2)$ of radius $r/2$ which contains at least
$\#(Q \cap \Omega)/M$ elements of $Q \cap \Omega$; in particular it contains at least one element $z_0$ of $Q$.
Since $(Q \cap \Omega \cap B(z, r/2)) - z_0$ is contained in $(Q - Q) \cap B(0,r)$, the claim follows. □



For a GAP $Q = Q((v_1, \ldots, v_r), (L_1, \ldots, L_r))$, define the dispersion $D(Q)$ to be the
quantity

(13) $D(Q) := \frac{\#Q}{\#(Q \cap B(0,1))}.$

Remark 6.5. The quantity $D(Q)$ is very close to the metric entropy $\mathcal{N}_1(Q)$ of $Q$;
indeed simple volume packing arguments (cf. Lemma 6.4) show that $D(Q) = \Theta_r(\mathcal{N}_1(Q))$.
We will however not use that fact here.



This quantity turns out to control the concentration probability of certain random 
walks associated with Q: 
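
Theorem 6.6 (Forward Littlewood-Offord theorem). For any $0 < \mu \le 1$, $\varepsilon > 0$,
and complex numbers $v_1, \ldots, v_r$, we have

$\mathbf{P}_\mu(v_1^{L_1^2} \ldots v_r^{L_r^2}) \ll_{\varepsilon,r} D(Q)^{-1+\varepsilon}$

where $Q := Q((v_1, \ldots, v_r), (\sqrt{\mu} L_1, \ldots, \sqrt{\mu} L_r))$.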

This "forward Littlewood-Offord theorem" will be crucial in establishing Theorem
3.2. To give the reader some feeling about this estimate, let us first consider a toy
case when $x$ is Bernoulli and $\mu = 1$. The adjusted random variable $x^{(\mu)}$ equals $0$
with probability $3/4$ and $\pm 2$ with probability $1/8$ each.

Assume furthermore that the $v_i$ are non-zero integers and $Q$ is proper. Thus
$Q \cap B(0,1) \subset \{-1, 0, 1\}$ and the desired bound becomes

$\mathbf{P}_1(v_1^{L_1^2} \ldots v_r^{L_r^2}) \ll_{\varepsilon,r} (\#Q)^{-1+\varepsilon}.$

Consider a (lazy) random walk $W$ starting at $0$. At step $j$, stay with probability
$3/4$ and move to the right or left by an amount $2v_j$ with probability $1/8$ each. The terminal
point after $n$ steps is exactly the random variable

$W_{x^{(1)}}(v_1 \ldots v_n) = x_1^{(1)} v_1 + \cdots + x_n^{(1)} v_n.$

Since $\mathbf{P}_\mu(v) := \mathbf{E} \exp(-\pi |W_{x^{(\mu)}}(v)|^2)$, the quantity $\mathbf{P}_1(v_1^{L_1^2} \ldots v_r^{L_r^2})$ can be bounded
from above by the sum of the probability that the lazy random walk with $L_1^2$ steps
of size $v_1$, ..., $L_r^2$ steps of size $v_r$ ends up on a point with absolute value at most
$10 \log(\#Q)$ and a negligible term which is much smaller than $(\#Q)^{-1}$.

Notice that the coefficient of $v_j$ is the sum of $L_j^2$ iid copies of $x^{(1)}$. It is well known
that the distribution of this sum is roughly uniform on the interval $[-L_j, L_j]$. (By
roughly uniform, we mean that for any two integers in this interval, the ratio of their
masses is bounded from above by a positive constant.) Thus, the main observation
here (and somehow the essence of the theorem) is that the end point of the walk
(conditioned on the fact that the coefficient of $v_j$ belongs to $[-L_j, L_j]$) is roughly
uniform in $Q$. It follows that the probability that it has absolute value $O(\log \#Q)$
can be bounded from above by $O\left(\frac{\log \#Q}{\#Q}\right) \le (\#Q)^{-1+\varepsilon}$, giving the desired bound.
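
The toy case can be explored numerically. The following is a minimal sketch, assuming Bernoulli $x$, $\mu = 1$, integer generators, and the lazy step law described above (0 with probability 3/4, $\pm 2v_j$ with probability 1/8 each). It computes the exact law of the walk by repeated convolution and then evaluates $\mathbf{E} \exp(-\pi|W|^2)$ next to $1/\#Q$. The generators $(1, 7)$ and dimensions $(3, 3)$ are a hypothetical example chosen so that $Q$ is proper; the theorem itself is asymptotic, so this only illustrates the order of magnitude.

```python
import math
from itertools import product


def lazy_walk_law(steps):
    """Exact law of a sum of independent lazy steps.

    Each step size v contributes 0 with probability 3/4 and +2v or -2v
    with probability 1/8 each (Bernoulli x, mu = 1).
    """
    law = {0: 1.0}
    for v in steps:
        new_law = {}
        for position, prob in law.items():
            for shift, weight in ((0, 0.75), (2 * v, 0.125), (-2 * v, 0.125)):
                key = position + shift
                new_law[key] = new_law.get(key, 0.0) + prob * weight
        law = new_law
    return law


def concentration_probability(generators, dims):
    """E exp(-pi |W|^2) for the walk with L_j^2 steps of size v_j."""
    steps = [v for v, L in zip(generators, dims) for _ in range(L * L)]
    law = lazy_walk_law(steps)
    return sum(p * math.exp(-math.pi * w * w) for w, p in law.items())


def gap_size(generators, dims):
    """Number of distinct points n_1 v_1 + ... + n_r v_r with |n_i| <= L_i."""
    ranges = [range(-L, L + 1) for L in dims]
    return len({sum(n * v for n, v in zip(c, generators)) for c in product(*ranges)})


if __name__ == "__main__":
    generators, dims = (1, 7), (3, 3)  # hypothetical proper example
    p1 = concentration_probability(generators, dims)
    size_q = gap_size(generators, dims)
    print(p1, 1 / size_q, p1 * size_q)
```

Because the walk lives on even integers, almost all of the Gaussian weight comes from the probability of returning to the origin, which matches the heuristic that the end point is roughly uniform over a set of size comparable to $\#Q$.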

This argument can be made rigorous for random variables $x$ with discrete supports,
even when $Q$ is not proper. However, the proof for the general case is more
complicated. The main technical tool needed is the following level set estimate:

Lemma 6.7 (Level set estimate). Given a GAP $Q = Q((v_1, \ldots, v_r), (L_1, \ldots, L_r))$, a
complex number $\xi_0$, and $\varepsilon > 0$, let $\Sigma \subset \mathbb{C}$ be the set

(14) $\Sigma := \{\xi \in B(\xi_0, 1) \mid \|\xi v_i\|_x \le D(Q)^{\varepsilon}/L_i \text{ for all } 1 \le i \le r\}.$

Then

(15) $\mathrm{mes}(\Sigma) \ll_r D(Q)^{-1+O_r(\varepsilon)}.$
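
We will prove Lemma 6.7 in Sections 7-8 below. For now, let us show how it implies
Theorem 6.6.

Proof of Theorem 6.6 assuming Lemma 6.7. We abbreviate $D := D(Q)$. In view
of (12), it suffices to show that

$\int_{\mathbb{C}} \exp\left(-\Omega\left(\mu \sum_{i=1}^{r} L_i^2 \|\xi v_i\|_x^2\right)\right) \exp(-\pi|\xi|^2)\, d\xi \ll D^{-1+\varepsilon}.$

Covering $\mathbb{C}$ by balls of radius 1, it thus suffices to show that

$\int_{B(\xi_0,1)} \exp\left(-\Omega\left(\mu \sum_{i=1}^{r} L_i^2 \|\xi v_i\|_x^2\right)\right) d\xi \ll D^{-1+\varepsilon}$

for all $\xi_0 \in \mathbb{C}$.

Now we fix $\xi_0$. Let $c$ be a small positive constant to be determined. It is clear that
if $D$ is sufficiently large, then

$\int_{B(\xi_0,1)} \exp\left(-\Omega\left(\mu \sum_{i=1}^{r} L_i^2 \|\xi v_i\|_x^2\right)\right) d\xi \le \mathrm{mes}(\Sigma) + D^{-1}$

where

$\Sigma := \left\{\xi \in B(\xi_0, 1) \mid \|\xi v_i\|_x \le \frac{D^{c\varepsilon}}{\sqrt{\mu} L_i} \text{ for all } 1 \le i \le r\right\}.$

(In fact, $D^{c\varepsilon}$ can be replaced by $C \log D$ for some large constant $C$.) By Lemma
6.7,

$\mathrm{mes}(\Sigma) \ll D^{-1+O(c\varepsilon)}.$

We choose $c$ equal to half of the reciprocal of the hidden constant in $O$. It follows that

$\mathrm{mes}(\Sigma) \le D^{-1+\varepsilon/2},$

which implies

$\int_{B(\xi_0,1)} \exp\left(-\Omega\left(\mu \sum_{i=1}^{r} L_i^2 \|\xi v_i\|_x^2\right)\right) d\xi \ll D^{-1+\varepsilon},$

concluding the proof. □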

To conclude the proof of Theorem 6.6, we need to establish Lemma 6.7. This is the 
purpose of the next two sections. 

7. Lacunary sets inside GAPs

Let $S$ be a set. We shall informally call a sequence $w_1, \ldots, w_d$ of elements of $S$
lacunary if the ratio $|w_i|/|w_{i+1}|$ is large for all $1 \le i < d$. The goal of this section
is to show that a large GAP always contains a large lacunary subset with some
prescribed properties. This fact will be a key ingredient in the proof of Lemma 6.7
(and hence Theorem 6.6), which we give in the next section.

To give the reader some motivation, let us consider the toy case when $Q$ is an
interval, say $\{-s, -s+1, \ldots, s-1, s\}$. Given a ratio $K \ge 2$ and a constant $R \ge 1$
(say), we can easily find $d$ elements $w_1, \ldots, w_d$ such that $|w_d| \ge R$ and $|w_i/w_{i+1}| \ge K$
where $d$ satisfies



The main result of this section is a generalization of the above observation for 
general GAPs. 
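
In the interval toy case the lacunary elements can be produced greedily. The sketch below is one natural construction (start at the largest element and repeatedly divide by $K$, rounding down, until the value drops below $R$); it is an illustration under our own naming and is not the algorithm used in the proof of Lemma 7.1, which also has to handle complex generators.

```python
def lacunary_sequence(s, K, R):
    """Greedy lacunary sequence w_1 > w_2 > ... inside {-s, ..., s}.

    Guarantees w_i >= K * w_{i+1} and w_d >= R for the returned list.
    """
    if K < 2:
        raise ValueError("the toy case assumes a ratio K >= 2")
    sequence = []
    w = s
    while w >= R and w > 0:
        sequence.append(w)
        w = int(w // K)  # floor division keeps w_i >= K * w_{i+1}
    return sequence


if __name__ == "__main__":
    print(lacunary_sequence(1000, 3, 2))
```

The length of the output grows like $\log(s/R)/\log K$, which is the behaviour that the crude upper bound in Lemma 7.1 generalizes.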

Lemma 7.1 (Lacunarity lemma). Let $K \ge 1$, let $Q$ be a symmetric GAP of
rank $r$, and let $R \ge 0$ be a radius. Then there exists, for some $d \ge 0$, "primary
vectors" $w_1, \ldots, w_d \in Q$, and "secondary vectors" $w'_1, \ldots, w'_d \in Q$ with the following
properties:

(i) (Lacunarity) We have $|w_i| \ge K|w_{i+1}|$ for all $1 \le i \le d-1$.

(ii) (Secondary bounds) We have $|w_i| \ge R$ and $|w'_i| \le |w_i|$ for all $1 \le i \le d$.

(iii) (Many vectors) We have

(16) $\#Q \le \left(\prod_{i=1}^{d} O_r(K K_i)\right) \#(Q \cap B(0,R))$

where $1 \le K_i \le 1 + K$ is the quantity

(17) $K_i := 1 + K \left|\mathrm{Im}\left(\frac{w'_i}{w_i}\right)\right|.$

(iv) (Crude upper bound) We have

(18) $d \ll_r 1 + \frac{\log \frac{\#Q}{\#(Q \cap B(0,R))}}{\log K}.$

Remark 7.2. The secondary vectors are necessary here because $Q$ is taking values
in the complex numbers; if $Q \subset \mathbb{R}$ then we could simply take $w'_i = w_i$ (and thus
$K_i = 1$) for all $i$. The reader may wish to follow the argument below in the real
case (and for $R = 0$), as it is somewhat simpler in that case. The bound (16) may
seem strange, but it is best possible except for the $O_r(\cdot)$ factor, and we will need
such a tight estimate in our applications. The vectors $w_1, \ldots, w_d, w'_1, \ldots, w'_d$ are
somewhat analogous to the Minkowski basis of a lattice with respect to a convex
body, thus (16) can be viewed as a variant of Minkowski's second theorem.



Proof. By increasing $K$ if necessary we may assume $K$ to be larger than any given
constant depending on $r$. We can also assume that $Q$ is not contained in $B(0,R)$,
as the claim is obvious otherwise.

We perform the following algorithm. We set $d_0 := C_r \left(1 + \frac{\log \frac{\#Q}{\#(Q \cap B(0,R))}}{\log K}\right)$ for
some sufficiently large constant $C_r$ depending only on $r$.

Step 0 Initialize $i = 1$. We also adopt the convention that $w_0 = \infty$.

Step 1 Let $Q_i := 2^{-d_0+i} Q \cap B(0, |w_{i-1}|/K)$. If $Q_i \subset B(0,R)$ then set $d := i - 1$
and STOP. Otherwise, let $w_i \in Q_i$ be chosen such that $|w_i|$ is maximal;
thus $|w_i| \le |w_{i-1}|/K$, $|w_i| > R$ and $Q_i \subset B(0, |w_i|)$.

Step 2 Let $w'_i \in Q_i$ be chosen to maximize the quantity $K_i$ defined in (17). Observe
that $|w'_i| \le |w_i|$.

Step 3 From elementary complex geometry we see that $Q_i$ is now contained in a
rectangle of dimensions $O(|w_i|) \times O\left(\frac{K_i}{K}|w_i|\right)$. This rectangle can be covered
by $O(K K_i)$ disks of radius $|w_i|/2K$. Applying Lemma 6.4, we conclude
that the set

$Q_{i+1} := 2^{-d_0+i+1} Q \cap B(0, |w_i|/K) \supset (Q_i - Q_i) \cap B(0, |w_i|/K)$

obeys the lower bound

(19) $\#Q_{i+1} \gg \frac{1}{K K_i} \#Q_i.$

Increment $i$ to $i+1$ and return to Step 1.



Since $w_1, w_2, \ldots$ have decreasing magnitude and lie in the finite set $Q$ we see that
this algorithm terminates in finite time. In fact we claim that this algorithm terminates
before step $d_0$. For if the algorithm reaches stage $d_0$, we have obtained
$w_1, \ldots, w_{d_0} \in Q$ obeying the lacunarity condition $|w_i| \le |w_{i-1}|/K$. This implies
that the GAP $Q((w_1, \ldots, w_{d_0}), K/10)$ is proper, and that the pairwise sums between
$Q((w_1, \ldots, w_{d_0}), K/10)$ and $2Q \cap B(0, R/10)$ are distinct and contained in
$(d_0 K + 1)2Q$. But this implies that

$(K/10)^{d_0}\, \#(2Q \cap B(0, R/10)) \le \#(d_0 K Q) \le O(d_0 K)^r\, \#Q.$

Also, since $B(0,R)$ can be covered by $O(1)$ balls of radius $R/20$, we see from Lemma
6.4 that

$\#(2Q \cap B(0, R/10)) \gg \#(Q \cap B(0,R))$

and thus

$(K/10)^{d_0} \ll O(d_0 K)^r\, \frac{\#Q}{\#(Q \cap B(0,R))}.$

But from the definition of $d_0$, we see that this is impossible if $C_r$ is chosen sufficiently
large (recall we are taking $K$ large compared to $r$). Thus we have $d < d_0$, which in
particular implies that $w_1, \ldots, w_d$ and $w'_1, \ldots, w'_d$ lie in $Q$. Since $Q$ is not contained
in $B(0,R)$ we also have $d \ge 1$.

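Next, we observe from (19) that

$\#(Q \cap B(0,R)) \ge \#Q_{d+1} \ge \left(\prod_{i=1}^{d} \frac{1}{O(K K_i)}\right) \#Q_1.$

Now we can cover $Q$ by $O_r(1)^{d_0}$ copies of $Q_1 = 2^{-d_0+1} Q$, and thus

(20) $\#Q \le \left(\prod_{i=1}^{d} O(K K_i)\right) O_r(1)^{d_0}\, \#(Q \cap B(0,R)).$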

In particular, since $K + 1 \ge K_i$ and $d_0 > d$ we have

using the definition of do and recalling that K is large compared to r we conclude 
that d do- The claim (16) now follows from (20). The remaining claims are 
easily verified from the construction. □ 



8. Proof of Lemma 6.7 



We are now ready to prove Lemma 6.7. In the following, $Q$ is fixed and we write
$D$ instead of $D(Q)$. We also fix $r$ and allow all implied constants to depend on $r$.
We may assume without loss of generality that $D$ is large compared with $\varepsilon$, since
the claim is trivial otherwise.

Let $K := D^{\varepsilon}$; since $D$ is assumed large compared to $\varepsilon$, we see that $K$ is also. We
apply Lemma 7.1 (with R= 1, and to the GAP -^Q) to obtain vectors 



(21) 



for some $d = O(1/\varepsilon)$ such that $|w_i| \ge K|w_{i+1}|$ for all $1 \le i \le d-1$, $|w_i| \ge 1$ and
$|w'_i| \le |w_i|$ for all $1 \le i \le d$, and



llO{KK,) 



#(gnB(o,i)) 



where Ki is defined in (17). Since Q has rank 0(1), we have 

# (;^g) » K-°^^^#Q = Di-o(^)#(Q n 5(0, 1)) 
and thus (since $d = O(1/\varepsilon)$ and $D$ is large compared with $\varepsilon$)

(22) $\prod_{i=1}^{d} K K_i \ge D^{1-O(\varepsilon)}.$



From (14), Lemma 5.3, and (21) we see that



(23) 

for all $1 \le i \le d$ and $\xi \in \Sigma$.

For 1 < i < d, define Q 



and C 



1^ 



1 

KKiWi 



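Let $P$ be the GAP

$P := Q\left((\zeta_1, \ldots, \zeta_d), \frac{K}{100}\right) + Q\left((\zeta'_1, \ldots, \zeta'_d), \left(\frac{K_1}{100}, \ldots, \frac{K_d}{100}\right)\right).$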

One should view $P$ as a kind of "dual" to $Q$. It has the following properties:

Lemma 8.1 (Properties of $P$). We have

(i) $P$ is proper.

(ii) $\#P \ge D^{1-O(\varepsilon)}$.

(iii) $P \subset B(0, O(1/K))$.

(iv) If $z, z' \in P$ are distinct, then $z + \Sigma$ and $z' + \Sigma$ are disjoint.



Proof. We first verify (i). If $P$ is not proper, then we have a linear relation

$n_1 \zeta_1 + \cdots + n_d \zeta_d + m_1 \zeta'_1 + \cdots + m_d \zeta'_d = 0$

for some integers $n_1, \ldots, n_d, m_1, \ldots, m_d$, not all zero, with $|n_i| \le K/50$ and
$|m_i| \le K_i/50$ for $1 \le i \le d$. Let $j$ be the largest index such that $(n_j, m_j)$ is non-zero. If
$1 \le i < j$, then from the properties of $w_i$ we have

$|w_i| \ge K^{j-i} |w_j|$

and so

\Ci\<M-v \cl\< 



From the triangle inequality we then have

ImCi + . . . + n^-iO-i + miC[ + ... + mj-iCj_i \ < ^ 

and thus 

InjCj+rrijQ < 



101 



On the other hand, since nj,mj are integers which are not both zero, and = 
V— T-^Cij -iiid K/Kj > 1/2, we see that 



|n,o+m,-c;i>M, 



a contradiction. 

From (i) we also see that



#p > f] n(—)Q(^) 

^ -li Mnri'' Mfifi'' 



100' noo' 

i=l 



and so (ii) now follows from (22) (recalling that $d = O(1/\varepsilon)$ and $D$ is large compared
to $\varepsilon$).

Now we prove (iii). If z G P, then we see from the triangle inequality that 



by lacunarity. But by construction \wd\ > 1, and the claim follows. 

Now we prove (iv). If the claim were false, then we could find distinct $z, z' \in P$ and
$\xi, \xi' \in \Sigma$ such that $z - z' = \xi - \xi'$. We can then write

$z - z' = n_1 \zeta_1 + \cdots + n_d \zeta_d + m_1 \zeta'_1 + \cdots + m_d \zeta'_d = \xi - \xi'$

for some integers $n_1, \ldots, n_d, m_1, \ldots, m_d$, not all zero, with $|n_i| \le K/50$ and
$|m_i| \le K_i/50$.

Let $j$ be the largest index such that $(n_j, m_j)$ is non-zero. From (23) and Lemma
5.3 we have

(24) \\{z - z')wj\U, \\{z - z')w'j\U « 

On the other hand, from the triangle inequality we have 



by lacunarity, and thus 



If $K$ is large enough, then we can apply Lemma 5.3 to conclude from (24) that

Re((^ - z')wj),Re{{z - z')w'^) = 0{^). 
On the other hand, observe that 

Iz-z'- (n,0 + n.,Q\ < ^ g + mil « i E ^ « 

Z=l 1=1 ' ' I J I 

and so by the triangle inequality 

\Re{{njCj + mjCj)wj)\, \M{njCj + mjCj)w'j)\ < ^ 

if K is large enough. On the other hand, by construction of Q , ('j we have 

nj_ 



Rc((n,0 + m,(;pu;,) - ' 



Since $n_j$ is an integer, we conclude $n_j = 0$. In that case we have

|Re((n,C, +m,Cj)^)l = ^IM J)l > ^ 

if $K_j \ge 2$, by construction of $\zeta'_j$ and $K_j$. Since $m_j$ is an integer, we conclude $m_j = 0$.

On the other hand, if $K_j < 2$, then we have $m_j = 0$ as well, since $|m_j| \le K_j/50$.
But $(n_j, m_j)$ is non-zero, a contradiction. □

From properties (ii), (iii), (iv) we see that

$\mathrm{mes}(B(0, O(1))) \ge D^{1-O(\varepsilon)}\, \mathrm{mes}(\Sigma)$

and the claim (15) follows.



9. Structure of weak elements 



Let $Q$ be a GAP. Extend $Q$ by a new dimension generated by a new element $z$:
$Q' = Q + \{-kz, \ldots, kz\}$. We call $z$ weak if $\#Q'$ is only slightly more than $\#Q$.
The goal of this section is to quantify (and generalize) the following phenomenon:

The set of weak $z$ has small entropy.






The reader may find the following simple example illustrative. Assume that $Q$ is
the interval $[-s, \ldots, s]$. Assume that $Q' := Q + \{-kz, \ldots, kz\}$ has cardinality at
most $ls$, where $l = k^{\delta}$ for some small positive $\delta$.

Consider the interval $Q_1 := \{x \in \mathbb{Z} \mid |x| \le sl/k\}$. The sets $x + \{z, \ldots, kz\}$, $x \in Q_1$
are subsets of $Q'$. Since $\#Q_1 > ls/k$, these sets are not disjoint. Thus, we have
$x + jz = x' + j'z$ for some distinct $x, x' \in Q_1$ and $1 \le j \ne j' \le k$. This implies that

$z \in \bigcup_{1 \le \tau \le k} \frac{1}{\tau}(Q_1 - Q_1).$

This already gives a bound $k\, \#(Q_1 - Q_1) = O(l\, \#Q) = O(ls)$ on the cardinality of
the possible $z$. But this bound can be improved further (this improvement is critical
later on). Consider the set $x + \{0, \ldots, lz\}$ with $x \in Q$. By the same argument as
before, these sets are not disjoint, and we can conclude that

$z \in \bigcup_{1 \le \tau' \le l} \frac{1}{\tau'}(Q - Q).$

Thus, $z$ has two representations

$z = \frac{x}{\tau} = \frac{x'}{\tau'}$

for $x \in Q_1 - Q_1$, $1 \le \tau \le k$ and $x' \in Q - Q$, $1 \le \tau' \le l$. If $\frac{x}{\tau}$ is irreducible, then

r < I and the number of z's of this form is only at most l#{Qi — Qi) = 0{^s). If it 
is not, then gcd(x, r) > j. The number of x satisfying this condition in Qi — Qi is 

at most 0{^#Qi). Thus, the number of z's is at most X;^^; 0{!^#Qi) = 0{!js), 
using the bound on #Qi and the fact that I = k^^^\ Thus, altogether we obtain 
the bound 



which is much better than the previous bound $O(l\, \#Q)$. The term $k^{-1}$ will play a
critical role in later proofs.

The main result of this section is a generalization of this very special case. 

Lemma 9.1. Let $w_1, \ldots, w_r$ be complex numbers and $Q = Q((w_1, \ldots, w_r), (L_1, \ldots, L_r))$.
Let $z$ be a complex number and $k$ a positive integer. Define

$Q' := Q + Q(z, k) = Q + \{-kz, \ldots, kz\}.$

Let $Z$ denote the set of all complex numbers $z$ such that

$D(Q') \le l\, D(Q).$

Then Z has a 2A-net of size at most 1 + Or{l'^k~^T>{Q)) . 

Remark 9.2. The 24-net can be replaced by an 1-net if we replace the bound 1 + 
Or{l'^k~^'D{Q)) by Or{l + l'^kr^T){Q)). However, it is important to us to have 
the current formulation, as in the case when l'^k~^T){Q) = o(l) the net will have 
size exactly 1. The power of might be improvable, but we will not need this 
improvement here, as $l$ will always be relatively small for us compared to other
parameters such as $k$ and $D(Q)$.



Proof. Let $z \in Z$. By definition of $Z$, we have

$\frac{\#(Q + Q(z,k))}{\#((Q + Q(z,k)) \cap B(0,1))} \le l\, \frac{\#Q}{\#(Q \cap B(0,1))}.$

Let $W \subset \frac{1}{2}Q$ be a maximal 1-net of $\frac{1}{2}Q$, then we see that the sets $w + (Q \cap B(0,1))$
for $w \in W$ cover $\frac{1}{2}Q$, and thus

$\#W \ge \frac{\#(\frac{1}{2}Q)}{\#(Q \cap B(0,1))} \gg_r \frac{\#Q}{\#(Q \cap B(0,1))},$

thanks to the easily verified fact that $\#(\frac{1}{2}Q) \gg_r \#Q$.

Refine the 1-net $W$ to a maximal 2-net $W'$. We have $\#W' \gg_r \#W$ and thus

(25) $\#(Q + Q(z,k)) \ll_r l\, \#((Q + Q(z,k)) \cap B(0,1))\, \#W'.$

Now, define the set

$L := \{-2k \le j \le 2k \mid jz \in 2Q + B(0,2)\}.$

A simple greedy algorithm argument (using the symmetry of $L$) shows that we can
find a set $J \subset \{-k, \ldots, k\}$ of cardinality $\#J \gg k/\#L$ such that $j_1 - j_2 \notin L$ for any
distinct $j_1, j_2 \in J$. Consider the sets $jz + W' + ((Q + Q(z,k)) \cap B(0,1))$ for $j \in J$.
By the construction, we can verify that

(a) These sets are disjoint (thanks to the definition of $J$ and $L$).

(b) Every set lies in $2(Q + Q(z,k))$ (since $|j| \le k$ and $W \subset \frac{1}{2}Q$).

(c) Each set has cardinality $(\#W')\, \#((Q + Q(z,k)) \cap B(0,1))$ (since $W'$ is a
2-net).

It follows that

$\#J\, (\#W')\, \#((Q + Q(z,k)) \cap B(0,1)) \le \#(2(Q + Q(z,k))) \ll_r \#(Q + Q(z,k)).$

Combining this with (25) we conclude that

$\#J \ll_r l.$

On the other hand, $\#J \gg k/\#L$, so

(26) $\#L \gg_r l^{-1} k,$

which asserts that many multiples of $z$ are close to $2Q$.

Let $R_0$ be the smallest radius such that

(27) $\#(10Q \cap B(0, R_0)) \ge C_r\, l k^{-1}\, \#Q,$

where $C_r$ is a sufficiently large constant depending on $r$.

By definition,

(28) $\#(10Q \cap B(0, R_0/2)) = O_r(l k^{-1}\, \#Q).$

Assume, for a moment, that $|z| > 2R_0 + 4$. By the definition of $L$, we can find, for
each $j \in L$, an element $\zeta_j \in 2Q$ such that $|jz - \zeta_j| \le 2$. (If there are many $\zeta_j$, fix
one arbitrarily.) Let $j$ and $j'$ be two different indices, then

$|\zeta_j - \zeta_{j'}| \ge |j - j'|\,|z| - 4 \ge |z| - 4.$

This implies that the sets $\zeta_j + (10Q \cap B(0, R_0))$ are disjoint. Furthermore, as
$\zeta_j \in 2Q$, they all lie in $12Q$. Therefore,

$(\#L)\, \#(10Q \cap B(0, R_0)) \le \#(12Q) \ll_r \#Q.$

But this contradicts (27) if we choose $C_r$ sufficiently large. Thus we have

$|z| \le 2R_0 + 4.$

If $R_0 \le 10$, then $|z| \le 24$ and $Z$ has a maximal 24-net of cardinality 1 and we are
done.

From now on, we assume $R_0 > 10$. Thus $|z| \le 3R_0$.

From (26) and the pigeonhole principle we can find $j, j' \in L$ such that $0 < |j - j'| \ll_r l$.
Thus there exists an integer $0 < i \ll_r l$ such that $iz \in 4Q + B(0,4)$. Since
$|z| \le 3R_0$, we have $|iz| \ll_r lR_0$ and thus in fact $iz \in (4Q \cap B(0, O_r(lR_0))) + B(0,4)$.
Thus, to obtain the desired bound on J\fi{Z), it will suffice to show that 

Ma{aq n s(o, OriiRo))) i^k-^niQ). 



Let $Z'$ be any 4-net of $4Q \cap B(0, O_r(lR_0))$. Observe that the sets $\zeta' + (Q \cap B(0,1))$
for $\zeta' \in Z'$ are disjoint and lie in $5Q \cap B(0, O_r(lR_0))$. Thus we have

$(\#Z')\, \#(Q \cap B(0,1)) \le \#(5Q \cap B(0, O_r(lR_0))).$

Since $D(Q) = \frac{\#Q}{\#(Q \cap B(0,1))}$, it suffices to show that

#(5Q n S(0, OrilRo))) «r l^k-^i^Q. 

But (as we are working on the plane) we can cover $B(0, O_r(lR_0))$ by $O_r(l^2)$ balls
of radius $R_0/4$, so by Lemma 6.4 we have

$\#(10Q \cap B(0, R_0/2)) \gg_r l^{-2}\, \#(5Q \cap B(0, O_r(lR_0))).$

Comparing this with (28) we obtain the claim. □ 

10. Proof of the inverse theorems

We first prove Theorem 3.2. The proof of Theorem 3.3 can be obtained with some 
minor modifications. 

Let us begin with a simple reduction. Since x has O(l)-controlled second moment, 
from Chebyshev's inequality we see that |a;| > n^+^° with probability 0{n~^"^~'^°). 
Thus if we let x' be x conditioned on the event |x| < n^+^°, we see from the union 
bound that P/3,x(v) and P/3,x'(v) diflFer by at most 0{n~^^~^^). Thus (modifying p 
slightly if necessary) we may replace x hy x' , and so we may assume for the rest of 
the proof that 

(29) |x| < n^+i° = n'^(^) almost surely. 

Consider a point v in Sn,x,(i,p- Let V = (Vi, . . . , Vn) be the vector obtained from 
/3~^v/2 by rounding the coordinates to the nearest Gaussian integer multiple of 
^-A-2Q Clearly thus |V| = e(/3-i). Furthermore, by (29) 

Pl,cc(V) > P/3,x(v) > p. 

By Lemma 4.3, it follows that 

Pi(V)>p. 

We are going to find a small 0(l)-net (in the l°° norm) for the set of all possible 
V satisfying the last inequality. Set k := -n}/'^^^, and let d > 1 be an integer to be 
chosen later [d will be bounded by a constant.) 

Now we perform the following algorithm (following the proof of [27, Theorem 2.4])
to construct some elements $w_1, \ldots, w_r$ in $V$ for some $0 \le r \le d$.

Step 0 Initialize $r = 0$. Set $V^{[0]} = V$.

Step 1 Count how many $V_j \in V^{[r]}$ there are such that

$D(Q((w_1, \ldots, w_r, V_j), k)) > n^{\varepsilon} D(Q((w_1, \ldots, w_r), k)).$

If this number is less than $k^2$ then STOP. Otherwise, move on to Step 2.

Step 2 Applying Lemma 4.5(v), we can find some $V_j \in V^{[r]}$ such that

$D(Q((w_1, \ldots, w_r, V_j), k)) > n^{\varepsilon} D(Q((w_1, \ldots, w_r), k))$

and

$\mathbf{P}_1(V^{[r]} w_1^{k^2} \ldots w_r^{k^2}) \le \mathbf{P}_1(V^{[r+1]} w_1^{k^2} \ldots w_r^{k^2} V_j^{k^2}),$

where $V^{[r+1]}$ is obtained from $V^{[r]}$ by deleting a set of $k^2$ elements. We
then set $w_{r+1} := V_j$ and then increment $r$ to $r + 1$. If $r = d$ then STOP
(with an error); otherwise return to Step 1.

By induction, at each stage in this algorithm we have 

Pi(VM«;f )>p 
and hence by Theorem 6.6 and Lemma 4.5(ii)

D(Q((w;i, . . . , Wr), k)) « < nO(^)p-i = n^^^l 

On the other hand, by construction we have 

D(Q((u.i,...,«v),fc)) >n''^ 

Thus, the algorithm must terminate in Step 1 for some $r = O_\varepsilon(1)$. At this point,
we have obtained a tuple $(w_1, \ldots, w_r)$ of elements in $V$ with $r = O_\varepsilon(1)$ such that

(30) D(Q((m;i, . . . , Wr), k)) <e V 

and such that 

$D(Q((w_1, \ldots, w_r, V_j), k)) \le n^{\varepsilon} D(Q((w_1, \ldots, w_r), k))$
for all but at most $rk^2 = O_\varepsilon(n^{1-2\varepsilon}) \le n^{1-\varepsilon}$ values of $j$.

Now we have enough information to construct the net. First we show that it costs
a relatively small factor to take care of the exceptional coordinates. There are at
most $O_\varepsilon(k^2) \le n^{1-\varepsilon}$ exceptional values of $j$; we can fix the values of the exceptional
$j$ by paying a factor of




= exp(o(n)). 



For each cxc('ptional j, Vj is a Gaussian integer multiple of 0{n~^^^^) of magnitude 
0(/3~^). Thus, the number of possible choices for Vj is /3~^n'^(^^. So, after we fix 
the exceptional coordinates j, there are at most 

(/?-^n°(i))"'"' = exp(o(n)) 
ways to specify the values of these coordinates. 

As for the remaining (non-exceptional) coordinates $V_j$, Lemma 9.1 (along with (30),
the definition of $k$, and the bound $r = O_\varepsilon(1)$) shows that each such $V_j$ lies within
distance $O(1)$ of a set of cardinality $1 + O_\varepsilon(n^{-1/2+O(\varepsilon)} p^{-1})$. The set of all vectors
$V$ has an $O(1)$-net in the $l^\infty$ norm of size at most

$\exp(o(n)) \left(1 + O_\varepsilon(n^{-1/2+O(\varepsilon)} p^{-1})\right)^n = O(n^{(-1/2+O(\varepsilon))n} p^{-n}) + \exp(o(n))$

assuming $n$ sufficiently large depending on $p, \varepsilon$.

Changing an $O(1)$-net to a 1-net costs only an $O(1)$ factor. Thus, we can conclude
that there is a 1-net of size at most $O(n^{(-1/2+O(\varepsilon))n} p^{-n}) + \exp(o(n))$. As we can
choose $\varepsilon$ arbitrarily small, the proof of Theorem 3.2 is complete.

To prove Theorem 3.3, we just use the sparse version of all lemmas used in the
previous proof, except that we take $k$ equal to $\sqrt{m/\mu}$ rather than $n^{1/2-\varepsilon}$. The
starting point is

$\mathbf{P}_\mu(V) \gg p.$

Instead of $D(Q((w_1, \ldots, w_r), k))$, we will consider $D(Q((w_1, \ldots, w_r), \sqrt{\mu} k))$. Thus,
the gain from Lemma 9.1 is no longer $k^{-1}$ (which used to lead to the term $n^{-1/2+O(\varepsilon)}$
in the final bound), but instead $(k\sqrt{\mu})^{-1}$ (which leads to the term $n^{O(\varepsilon)}/\sqrt{m}$).
Meanwhile, the $\exp(o(n))$ factor is replaced with $\exp(n^{O(\varepsilon)} k^2) = \exp(n^{O(\varepsilon)} m/\mu)$.
The reader is invited to work out the simple details.

11. Proof of Theorems 2.5 and 2.9 

Theorem 2.5 follows from the following. Let $\sigma_n(M)$ denote the least singular value
of a matrix M of order n. We shall abbreviate N = Nn^p. 

Theorem 11.1. Let 7 > 6e such that 

• $\|M + N\| \le n^{\gamma}$ with probability one.

• $|x_1| + \cdots + |x_n| \le n^{\gamma}$ with probability one.

Then for any $A, B > 0$ with

(31) $B > 2\gamma A + 3\gamma + 1/2$

we have

$\mathbf{P}(\sigma_n(M + N) \le n^{-B}) \ll_{A,B,\gamma,\kappa} n^{-A}.$

Indeed, as x has finite second moment, we can assume that |a:| < n"*"*"^", at the 
cost of a (negligible) additional term o{n~^) in probability. Thus, by restricting x 
to the event |a:| < n^^^^ and using the assumption about M in Theorem 2.5, we 
can satisfy both assumptions in Theorem 11.1, for 7 large enough. 

Remark 11.2. We can have a more efficient form of the theorem by bounding the
probability that the two assumptions on $\|M + N\|$ and $|x_1| + \cdots + |x_n|$ fail (rather
than assuming that they hold with probability one). The relation between $B$ and
$\gamma, A$ can be strengthened and we will do that in another paper.

We now prove Theorem 11.1. We suppress all dependence of the implied constants
on $A, \gamma, B, \kappa$.

Let us call a unit vector $v = (v_1, \ldots, v_n)$ poor if we have

$\mathbf{P}_{n^{-B+1/2},x}(v) \le n^{-A-1},$

and rich otherwise. Theorem 11.1 follows directly from the following two lemmas
and the fact that

$\sigma_n(M) = \inf_{|v| = 1} |Mv|.$

Lemma 11.3 (Poor vectors are negligible). We have

$\mathbf{P}(\|(M + N)v\| \le n^{-B} \text{ for some poor unit vector } v) \ll n^{-A}.$

Lemma 11.4 (Rich vectors are negligible). We have

$\mathbf{P}(\|(M + N)v\| \le n^{-B} \text{ for some rich unit vector } v) \ll n^{-A}.$
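
The variational identity $\sigma_n(M) = \inf_{|v|=1} |Mv|$ that reduces Theorem 11.1 to these two lemmas can be sanity-checked in the smallest non-trivial case. The sketch below is restricted to real $2 \times 2$ matrices (our simplifying assumption, so that the unit circle can be scanned directly) and compares the closed-form least singular value with a brute-force minimum over unit vectors; both function names are ours.

```python
import math


def least_singular_value_2x2(a, b, c, d):
    """Least singular value of the real matrix M = [[a, b], [c, d]]."""
    # Entries of the Gram matrix M^T M = [[p, q], [q, s]].
    p, q, s = a * a + c * c, a * b + c * d, b * b + d * d
    half_trace = (p + s) / 2
    disc = max(half_trace ** 2 - (p * s - q * q), 0.0)
    smallest_eigenvalue = half_trace - math.sqrt(disc)
    return math.sqrt(max(smallest_eigenvalue, 0.0))


def min_norm_on_unit_circle(a, b, c, d, samples=20000):
    """Approximate inf |Mv| over real unit vectors v = (cos t, sin t)."""
    best = math.inf
    for i in range(samples):
        t = math.pi * i / samples  # v and -v give the same norm
        x, y = math.cos(t), math.sin(t)
        best = min(best, math.hypot(a * x + b * y, c * x + d * y))
    return best


if __name__ == "__main__":
    entries = (1.0, 2.0, 3.0, 4.0)
    print(least_singular_value_2x2(*entries), min_norm_on_unit_circle(*entries))
```

The poor/rich dichotomy splits the unit vectors attaining a small value of $|(M+N)v|$ according to how concentrated the random walk with steps $v$ is, and each lemma handles one half of the infimum.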
Proof. (Proof of Lemma 11.3) We repeat the argument from [28]. Let $E$ be the
event that $\|(M + N)v\| \le n^{-B}$ for some poor unit vector $v$. If $E$ holds, then the
least singular value of $M + N$ is at most $n^{-B}$, and so the same is true for the adjoint
$(M + N)^*$. Thus there exists a unit row vector $w^*$ such that

(32) $\|w^*(M + N)\| \le n^{-B}$.

Write $w^* = (w_1, \ldots, w_n)$. By paying a factor of $n$ and using symmetry we may
assume that the last coefficient of $w^*$ has the largest magnitude, thus

(33) $|w_n| \ge |w_i|$ for all $i$.

In particular, since $w^*$ has unit length, we have

(34) $|w_n| \ge n^{-1/2}$.

Thus, if we let $F$ be the event that there exists a unit vector $w^*$ obeying both (32)
and (33), we have

(35) $P(E) \le n P(E \wedge F)$.

Let $X_1, \ldots, X_n$ be the rows of $M + N$. We shall condition on the first $n - 1$ rows
$X_1, \ldots, X_{n-1}$. Observe that if $E$ holds, then there exists a poor unit vector $v$ such
that

$(\sum_{i=1}^{n} |X_i \cdot v|^2)^{1/2} = \|(M + N)v\| \le n^{-B}$.

Thus, if $P(E \mid X_1, \ldots, X_{n-1})$ is non-zero, then there exists a poor unit vector $u$,
depending only on $X_1, \ldots, X_{n-1}$, such that

(36) $(\sum_{i=1}^{n-1} |X_i \cdot u|^2)^{1/2} \le n^{-B}$.



On the other hand, if $F$ holds, and $w^* = (w_1, \ldots, w_n)$ is as above, then by (32)

$\|w_1 X_1 + \cdots + w_n X_n\| \le n^{-B}$;

taking inner products with the unit vector $u$ and using the triangle inequality, we
conclude

$|w_n| |X_n \cdot u| \le \sum_{i=1}^{n-1} |w_i| |X_i \cdot u| + n^{-B}$.

Using (34), Cauchy-Schwarz, and (36) we conclude

$|X_n \cdot u| \le n^{1/2}(n^{-B} + n^{-B}) = 2n^{-B+1/2}$.

On the other hand, since $u$ is poor, and $X_n$ is independent of $X_1, \ldots, X_{n-1}$ (and
hence independent of $u$ also), we have

$P(|X_n \cdot u| \le 2n^{-B+1/2} \mid X_1, \ldots, X_{n-1}) \ll n^{-A-1}$.

Putting all this together, we conclude that

$P(E \wedge F \mid X_1, \ldots, X_{n-1}) \ll n^{-A-1}$

uniformly in the choice of $X_1, \ldots, X_{n-1}$. Integrating over $X_1, \ldots, X_{n-1}$ and using
(35) we obtain $P(E) \ll n^{-A}$, as desired. □
Proof. (Proof of Lemma 11.4) Let $\varepsilon > 0$ be a sufficiently small constant (in particular,
smaller than the constant in Theorem 3.2); we allow all implied constants to
depend on $\varepsilon$. We may also assume that $n$ is sufficiently large depending on $\varepsilon$.

Let $J$ be the smallest integer strictly larger than $2A + 2$, thus $2A + 2 < J \le 2A + 3$.
Thus, if we set $\delta := (A + 1)/J$, then (using (31)) we have $0 < \delta < 1/2$ and $B > J\gamma + 1/2$.
If $\varepsilon$ is sufficiently small, we thus have

(37) $B > JC + 1/2$ and $\delta + 3\varepsilon < 1/2$

where

(38) $C := \gamma + 2\varepsilon$.

Let $v$ be a rich unit vector. For $j = 0, 1, \ldots, J$, consider the quantities

$p_{n^{-B+Cj+1/2},x}(v)$.

These quantities are increasing in $j$, and range between $n^{-A-1}$ and 1 since $v$ is
rich. Applying the pigeonhole principle and using the definition of $\delta$, we can thus
find an index $0 \le j \le J - 1$ such that

$p_{n^{-B+C(j+1)+1/2},x}(v) \le n^{\delta} p_{n^{-B+Cj+1/2},x}(v)$.
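The pigeonhole step can be made concrete: if $J + 1$ non-decreasing numbers lie between $n^{-A-1}$ and 1, the $J$ consecutive ratios multiply to at most $n^{A+1}$, so at least one ratio is at most $n^{(A+1)/J} = n^{\delta}$. The sketch below illustrates this on an abstract sequence; the function names and the sample values are hypothetical, and the small tolerance factor only guards against floating-point rounding.

```python
import math

def smallest_integer_above(x):
    """Smallest integer strictly larger than x (the choice of J for x = 2A + 2)."""
    return math.floor(x) + 1

def pigeonhole_index(p, n, A):
    """Given a non-decreasing list p[0..J] with p[0] >= n**(-A-1) and p[J] <= 1,
    return the first j with p[j+1] <= n**delta * p[j], where delta = (A+1)/J."""
    J = len(p) - 1
    delta = (A + 1) / J
    bound = n ** delta
    for j in range(J):
        if p[j + 1] <= bound * p[j] * (1 + 1e-12):
            return j
    return None  # cannot happen when the hypotheses on p hold
```

For example, with $A = 1$ one gets $J = 5$ and $\delta = 2/5$, and the sequence `[0.01, 0.5, 0.6, 0.7, 0.8, 1.0]` with `n = 10` yields the index 1.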

Define, for any $0 \le j \le J - 1$ and $1 \le k \le \lceil (A + 1)/\varepsilon \rceil$, the set $\Omega_{j,k}$ as

$\Omega_{j,k} := \{ v \mid (v \text{ rich}) \wedge (p_{n^{-B+C(j+1)+1/2},x}(v) \le n^{\delta} p_{n^{-B+Cj+1/2},x}(v)) \wedge (p_{n^{-B+Cj+1/2},x}(v) \in [n^{-\varepsilon k}, n^{-\varepsilon(k-1)})) \}$.

Since the number of pairs $j, k$ is $O(1)$, it suffices by the union bound to show that
for each fixed $j, k$

(39) $P(\|(M + N)v\| \le n^{-B}$ for some unit vector $v \in \Omega_{j,k}) = o(n^{-A})$.

In fact, we are going to show that this probability is exponentially small.

Let $p := n^{-\varepsilon k}$, so that every $v \in \Omega_{j,k}$ satisfies $p_{n^{-B+Cj+1/2},x}(v) \ge p$ and Theorem 3.2 applies at radius $n^{-B+Cj+1/2}$. Thus by
this theorem, there is a set $V$ of cardinality at most

$\#V \le n^{-n/2+\varepsilon n} p^{-n} + \exp(o(n))$

such that for each $v \in \Omega_{j,k}$ there is $v' \in V$ such that $\|v - v'\|_\infty \le n^{-B+Cj+1/2}$.

Consider $v \in \Omega_{j,k}$ and $v' \in V$ as above. Recall that $\|M + N\| \le n^{\gamma}$ almost surely.
Thus with probability 1 we have

$\|(M + N)(v - v')\| \le n^{-B+Cj+1+\gamma}$.

By the triangle inequality, if $\|(M + N)v\| \le n^{-B}$ we have

$\|(M + N)v'\| \le 2n^{-B+Cj+1+\gamma}$.
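As usual, let $X_i$ be the $i$th row of $M + N$. It follows that there are at least
$n' := n - n^{1-\varepsilon}$ coordinates $1 \le i \le n$ such that

$|X_i \cdot v'| \le n^{-B+Cj+1/2+\gamma+\varepsilon}$.

Now we relate the probability that $|X \cdot v'| \le n^{-B+Cj+1/2+\gamma+\varepsilon}$ with $p_{n^{-B+C(j+1)+1/2},x}(v)$, where $X :=
(x_1, \ldots, x_n)$. Consider the quantity

$|X \cdot v - X \cdot v'| = |(v_1 - v_1')x_1 + \cdots + (v_n - v_n')x_n|$.

Notice that $|v_i - v_i'| \le n^{-B+Cj+1/2}$ and also $\sum_i |x_i| \le n^{\gamma}$ with probability one.
Thus

$|X \cdot v - X \cdot v'| \le n^{-B+Cj+1/2+\gamma}$

which implies, through the triangle inequality, that

$|X \cdot v| \le n^{-B+Cj+1/2+\gamma} + n^{-B+Cj+1/2+\gamma+\varepsilon} \le n^{-B+C(j+1)+1/2}$,

where in the last inequality we used (38).
We can then conclude that

$P(|X_i \cdot v'| \le n^{-B+Cj+1/2+\gamma+\varepsilon}) \le p_{n^{-B+C(j+1)+1/2},x}(v) \le n^{\delta} p_{n^{-B+Cj+1/2},x}(v) \le n^{\delta+\varepsilon} p$,

where in the last inequality we used the definition of $\Omega_{j,k}$.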

Also, a very crude second moment argument, using the fact that $x$ has $\kappa$-controlled
second moment, gives

(40) $p_{n^{-B+C(j+1)+1/2},x}(v) \le 1 - \delta'$

if $\delta' > 0$ is small enough depending on $\kappa$. Thus

$P(|X_i \cdot v'| \le n^{-B+Cj+1/2+\gamma+\varepsilon}) \le \min(n^{\delta+\varepsilon} p, 1 - \delta')$.

By the union bound, we thus have

$P(\|(M + N)v'\| \le 2n^{-B+Cj+1+\gamma}) \le \min(n^{\delta+\varepsilon} p, 1 - \delta')^{n'}$.

Again by the union bound, the left-hand side of (39) is at most

$(n^{-n/2+\varepsilon n} p^{-n} + \exp(o(n))) \min(n^{\delta+\varepsilon} p, 1 - \delta')^{n'}$

$\ll n^{-n/2+\varepsilon n} p^{-n} (n^{\delta+\varepsilon} p)^{n'} + \exp(o(n)) (1 - \delta')^{n'}$.

It is routine to verify that the last quantity is $o(n^{-A})$ (indeed, we obtain a bound
of the form $O(\exp(-an))$ for some $a > 0$). Our proof is complete. □
11.5. The sparse case. Now we sketch the proof of Theorem 2.9. We repeat the
above arguments with the following changes. We will of course replace Theorem 3.2
by Theorem 3.3, with $\mu := \rho$. Due to the presence of the additional factors of $\mu$ in
that theorem, we can no longer afford to choose $\delta$ close to 1/2, so we instead choose
$\delta$ to be very small, say $\delta = \varepsilon$, where $\varepsilon$ is very small compared to $1 - \alpha$. In order to
take $\delta$ this small, we will need $B$ to be much larger than what (31) requires, but
this is not a problem. For our applications, all we need is that $B$ does not depend
on $n$.

The treatment of the poor vectors (Lemma 11.3) in the sparse case is the same
as in the non-sparse case. The treatment of the rich vectors (Lemma 11.4) is also
essentially the same, except for the fact that we no longer have (40). To be more
precise, $1 - \delta'$ needs to be replaced by $1 - \delta'\rho$. In the cases when $k$ is larger than
some absolute constant (say 5), (40) is not needed, since in this case $p$ is sufficiently
small and

$\min(n^{\delta+\varepsilon} p, 1 - \delta'\rho) = \min(n^{2\varepsilon} p, 1 - \delta'\rho) = n^{2\varepsilon} p$

and the above argument goes through without difficulty, so long as one applies
Theorem 3.3 with $m := n^{C_0\varepsilon}$ for some sufficiently large absolute constant $C_0$.

In the remaining case where $k$ is at most 5, the replacement of $1 - \delta'$ by $1 - \delta'\rho$
becomes too expensive and we will avoid it by a rescaling argument, using the
pigeonhole principle.

To start, notice that from the definition of Qj^k and the fact A; < 5, we have 
(41) f„-B',xi,(v)>n-^^ 

for some fixed B — 0^(1) < B' < B. Since the left-hand side is pi^xipi^^ v), we 
also see from Lemma 4.4 that 

P^(n^'v) » n-^'. 

We observe that this implies that v is "compressed" in the sense that at most 
^woe I p q£ ^j^g coefficients of u = {v\,. . . ,Vn) can exceed in magnitude. 

(Of course, instead of 100, one can use any large constant.) Indeed, if instead we
had at least n^^^^/p coefficients Vi of magnitude at least n"^'"*"^" for some large 
absolute constant A, we see from Lemma 4.5 that 

for one of these Vi, but one can show that this is not the case by a direct computation 

using the $\kappa$-controlled second moment hypothesis, or else by an appeal to Theorem
6.6. (Here we used the notation of Lemma 4.5: {zY denotes a vector of length s 
whose every coordinate equals z.) 






We have just seen that 

|{1 < i < n : \vi\ > n-^'+i°}| < n™7p. 

Next, we apply the pigeonhole principle to conclude the existence of a B" with 
B' - Oe(l) < B" <B' -10 and an integer m = Oe,^(l) with 

(42) n™^ < n^°°7p 

such that 

^(m-i)e < |{i < i < n : > n-^"+^°+^}| < |{1 < ^ < n : > n"^"}! < n""'. 

By paying a factor of Oe_^(l) in our final probability bound we may fix B" and 
m. If we define the vector w by setting Wi to be the nearest (Gaussian integer) 
multiple of +^ to Vi, we see that Wi is non-zero for at most n™^ coordinates i, 

and has magnitude ^ +10+7 ^^y. least n^™^^^'^ of these coordinates. Also, 
if II (M + N)v\\ < , we see from the triangle inequality and crude computations 
that II (M + 7V)w|| < n--^"+5+T (say), recalling that B" < B + 10. 

On the other hand, note that if we let Ij.p be independent samples of Ip, then 
with probability f2(n^™~^)^p), there is at least one i with li^p = 1 and Iwil = 
Q{n~^ +10+7^ Prom this we conclude that 

for some absolute constant 5' > (cf. (40)), and thus for each fixed w we have 
P(||(M + 7V)w|| < n-^"+^+T) «: exp(-f2(n(™-i)Vn)). 

On the other hand, a direct counting argument shows that the number of possible 
w is at most exp(0(n*^'""'"^)^)). Recall that p > n"^"*"" and a is much larger than 
£. It follows that 

for any m. Applying the union bound we obtain a suitably small contribution to 
the sparse analogue of Lemma 11.4, as required. 

12. Proof of the circular law 
We now use Theorem 2.5 to derive Theorem 1.2. 

By Lemma 2.4 and rotating $x$ by a constant phase if necessary, we may assume
that $x$ has $\kappa$-controlled second moment for some $\kappa$. (Here of course we use the obvious
fact that the circular law is invariant under phase rotation of the underlying random
variable $x$.) Allowing implied constants to depend on this $\kappa$, we thus have that $x$ has
$O(1)$-controlled second moment, which will allow us to apply Theorem 2.5 later.
We closely follow the (now standard) arguments in [2, Chapter 10] (which are in
turn based on the earlier work of Girko [9] and Bai [1]), which we briefly review
here.

Let $c_n : \mathbb{R} \times \mathbb{R} \to \mathbb{C}$ be the characteristic function

$c_n(u,v) := \int_{\mathbb{R}} \int_{\mathbb{R}} e^{\sqrt{-1}ux + \sqrt{-1}vy} \mu_n(dx, dy)$

of the ESD $\mu_n$, and similarly define

$c(u,v) := \int_{\mathbb{R}} \int_{\mathbb{R}} e^{\sqrt{-1}ux + \sqrt{-1}vy} \mu_\infty(dx, dy)$

of the uniform measure $\mu_\infty$ on the disk. The sequence of empirical measures $\mu_n$
can be shown to be a.s. tight just from the assumption that $x$ has finite second
moment (see [2, Lemma 10.5] and [2, Theorem 3.6], and also the discussion in [2,
p. 295]), and so by standard arguments it suffices to show, for almost every $u, v$,
that

(43) $c_n(u,v) \to c(u,v)$

almost surely.
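Since the ESD is a sum of point masses at the eigenvalues, its characteristic function is a finite average of complex exponentials. The following sketch evaluates it for a given list of eigenvalues; the function name is illustrative, and the eigenvalues are assumed to be supplied already normalised.

```python
import cmath

def esd_characteristic_function(eigenvalues, u, v):
    """Characteristic function of the empirical measure placing mass 1/n
    at each eigenvalue: (1/n) * sum_k exp(i*(u*Re(lam_k) + v*Im(lam_k)))."""
    n = len(eigenvalues)
    total = 0j
    for lam in eigenvalues:
        lam = complex(lam)
        total += cmath.exp(1j * (u * lam.real + v * lam.imag))
    return total / n
```

For the four points 1, -1, i, -i this reduces to (cos u + cos v)/2, which gives a quick sanity check.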

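Henceforth we fix $u, v$. We can take $uv \ne 0$ since we only need the claim for almost
every $u, v$. From [2, Lemma 10.2], we have the Stieltjes transform identity (first
observed by Girko [9])

$c_n(u,v) = \frac{u^2+v^2}{2\sqrt{-1}u\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{1}{n} \sum_{k=1}^{n} \frac{s - \mathrm{Re}(\lambda_k)}{|\lambda_k - z|^2} e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt$

$= \frac{u^2+v^2}{2\sqrt{-1}u\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{\partial}{\partial s} \frac{1}{n} \log |\det(\frac{1}{\sqrt{n}} N_n - z I_n)| e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt$

$= \frac{u^2+v^2}{4\sqrt{-1}u\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} g_n(s,t) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt$,

where $z := s + \sqrt{-1}t$,

$g_n(s,t) := \frac{\partial}{\partial s} \int_0^\infty \log x \, \nu_n(dx, z)$,

and $\nu_n$ is the empirical distribution of the positive-definite Hermitian matrix

$H_n := (\frac{1}{\sqrt{n}} N_n - z I_n)(\frac{1}{\sqrt{n}} N_n - z I_n)^*$.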
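The link between the eigenvalues $\lambda_k$ and the Hermitian matrix $H_n$ rests on the identity $\det H_n = |\det(\frac{1}{\sqrt{n}} N_n - z I_n)|^2 = \prod_k |\lambda_k - z|^2$, that is, $\int_0^\infty \log x \, \nu_n(dx,z) = \frac{2}{n} \sum_k \log|\lambda_k - z|$. The sketch below checks this for a 2 x 2 complex matrix standing in for $\frac{1}{\sqrt{n}} N_n$; the helper names and the sample matrix are assumptions made for illustration only.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the quadratic formula."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def log_det_H(a, b, c, d, z):
    """log det of H = B B^*, where B = [[a - z, b], [c, d - z]], computed entrywise."""
    b11, b12, b21, b22 = complex(a - z), complex(b), complex(c), complex(d - z)
    h11 = abs(b11) ** 2 + abs(b12) ** 2
    h22 = abs(b21) ** 2 + abs(b22) ** 2
    h12 = b11 * b21.conjugate() + b12 * b22.conjugate()
    return math.log(h11 * h22 - abs(h12) ** 2)

def sum_log_distances(a, b, c, d, z):
    """Sum over eigenvalues of log |lambda_k - z|^2."""
    return sum(2 * math.log(abs(lam - z)) for lam in eigenvalues_2x2(a, b, c, d))
```

Both quantities become singular as $z$ approaches an eigenvalue, which is the numerical counterpart of the care needed with the unbounded logarithm.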

The expression $g_n(s,t)$ is absolutely integrable in $s, t$; however, because of the
unboundedness of $\log x$, Fubini's theorem is not directly applicable, and one must
take some care with interchanging integrals or derivatives in this expression. In [2,
Lemma 10.4], the analogous identity

$c(u,v) = \frac{u^2+v^2}{4\sqrt{-1}u\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} g(s,t) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt$

is derived, where $g$ is a function whose explicit form (given in [2, p. 296]) we will
not review here. The task is then to show that

$\int_{\mathbb{R}} \int_{\mathbb{R}} (g_n(s,t) - g(s,t)) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt \to 0$

a.s.

(Instead of the arguments in [2, Chapter 10], one could also follow the approach of Götze and Tikhomirov [11], as was done in [19].)



The next steps in [2] are to perform some truncations in the region of integration.
Let $S > 2$ be any integer. In [2, Lemma 10.6] (see also the discussion in [2, p. 299]),
it was shown that

$\lim_{S \to \infty} \limsup_{n \to \infty} | \int\int_{|s| \ge S \text{ or } |t| \ge S^3} (g_n(s,t) - g(s,t)) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt | = 0$

a.s. Thus it suffices to show that

$\int\int_{|s| \le S, |t| \le S^3} (g_n(s,t) - g(s,t)) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt \to 0$

a.s. for every $S > 2$.

Fix $S$. For any $\varepsilon > 0$, let $T \subset \mathbb{R}^2$ denote the set

$T := \{ (s,t) \in \mathbb{R}^2 : |s| \le S, |t| \le S^3, ||z| - 1| \ge \varepsilon \}$

(recall that $z := s + \sqrt{-1}t$). In [2, Lemma 10.7] it is shown that

$\int\int_{|s| \le S, |t| \le S^3 : (s,t) \notin T} |g_n(s,t)| \, ds \, dt \le 32\sqrt{\varepsilon}$

and similarly with $g_n(s,t)$ replaced by $g(s,t)$; thus it suffices to show that

$\int\int_T (g_n(s,t) - g(s,t)) e^{\sqrt{-1}us + \sqrt{-1}vt} \, ds \, dt \to 0$

a.s. for each $\varepsilon > 0$.
Fix $\varepsilon > 0$. Recall that

$g_n(s,t) := \frac{\partial}{\partial s} \int_0^\infty \log x \, \nu_n(dx, z)$.

In [2, Lemma 10.10] it is shown that

$g(s,t) = \frac{\partial}{\partial s} \int_0^\infty \log x \, \nu(dx, z)$

where $\nu$ is an explicit probability measure which we will not review here; in particular,
the inner integral is absolutely convergent. Set $\varepsilon_n := n^{-2B}$ for some large
absolute constant $B$ (independent of $n$) to be chosen later. Using the integration
by parts argument given in [2, §10.7], it suffices to show that

(44) $\limsup_{n \to \infty} | \int\int_T \int_{\varepsilon_n}^\infty \log x \, (\nu_n(dx, z) - \nu(dx, z)) \, ds \, dt | = 0$

and

(45) $\limsup_{n \to \infty} | \int\int_T \int_0^{\varepsilon_n} \log x \, \nu_n(dx, z) \, ds \, dt | = 0$

and similarly with the two-dimensional integral on $T$ replaced by one-dimensional
integrals on the boundary of $T$. We shall only estimate the two-dimensional integrals,
as the treatment of the one-dimensional ones is similar.

We first prove (44). Since $x$ has finite second moment, a simple application of
Chebyshev's inequality and the Borel-Cantelli lemma, and crude bounds on the
spectral norm of $N_n$, shows that almost surely $\nu_n$ is supported on the interval
$[0, n^{100}]$. Thus it suffices to show that

$\limsup_{n \to \infty} | \int\int_T \int_{\varepsilon_n}^{n^{100}} \log x \, (\nu_n(dx, z) - \nu(dx, z)) \, ds \, dt | = 0$.

Observe that $\log x$ has total variation bounded by a finite multiple of $\log n$ on the
$x$ region of integration, thus it will suffice to show that

(46) $\limsup_{n \to \infty} (\log n) \sup_{z \in T} \sup_{x > 0} |\nu_n(x, z) - \nu(x, z)| = 0$.

For this, it is convenient to perform some truncation, following [2, §10.5.1]. Let
$0 < \delta < 1/4$ be arbitrary, and define the truncated random variables $\hat a_{ij}$ (depending
on $n$) by

$\hat a_{ij} := a_{ij} 1(|a_{ij}| \le n^{\delta}) - E(a_{ij} 1(|a_{ij}| \le n^{\delta}))$

and the normalised random variable $\tilde a_{ij}$ by

$\tilde a_{ij} := \hat a_{ij} / \sqrt{E(|\hat a_{ij}|^2)}$.

One easily verifies that $\tilde a_{ij}$ has mean zero, variance 1, and is also bounded by $O(n^{\delta})$
almost surely. Let $\tilde N_n$ be the matrix with entries $\tilde a_{ij}$, let $\tilde H_n$ be the positive-definite
matrix

$\tilde H_n := (\frac{1}{\sqrt{n}} \tilde N_n - z I_n)(\frac{1}{\sqrt{n}} \tilde N_n - z I_n)^*$

and let $\tilde\nu_n(x, z)$ be the distribution function associated to $\tilde H_n$. The argument in [2,
§10.5.1] gives the bound

(47) $\limsup_{n \to \infty} n^{c\eta} \sup_{z \in T} L(\tilde\nu_n(\cdot, z), \nu_n(\cdot, z)) < \infty$

almost surely for some absolute constant $c > 0$, where $L$ denotes the Lévy distance.
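Returning to the truncation and normalisation of the entries defined above, the two steps can be carried out exactly for a random variable with finitely many values. The sketch below is an illustration under that assumption (a hypothetical discrete law given by `values` and `probs`, and a cutoff playing the role of $n^{\delta}$); it is not taken from [2].

```python
import math

def truncate_and_normalise(values, probs, cutoff):
    """For a discrete law P(a = values[i]) = probs[i], return the values taken by
    a_tilde = a_hat / sqrt(E|a_hat|^2), where
    a_hat = a*1(|a| <= cutoff) - E[a*1(|a| <= cutoff)]."""
    truncated = [v if abs(v) <= cutoff else 0.0 for v in values]
    mean = sum(p * t for p, t in zip(probs, truncated))
    centred = [t - mean for t in truncated]
    second_moment = sum(p * abs(c) ** 2 for p, c in zip(probs, centred))
    scale = math.sqrt(second_moment)
    return [c / scale for c in centred]

def mean_and_variance(values, probs):
    """Mean and second moment E|a|^2 of a discrete law."""
    mean = sum(p * v for p, v in zip(probs, values))
    second = sum(p * abs(v) ** 2 for p, v in zip(probs, values))
    return mean, second
```

After the transformation the mean is zero and the second moment is one, while every value is bounded by twice the cutoff divided by the normalising factor, matching the three properties claimed for $\tilde a_{ij}$.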
Next, from [2, Lemma 10.15] we have

(48) $\limsup_{n \to \infty} n^{c'\eta} \sup_{z \in T} \sup_{x > 0} |\tilde\nu_n(x, z) - \nu(x, z)| < \infty$

almost surely for another absolute constant $c' > 0$ (this is where we use the
hypothesis $\delta < 1/4$). Applying [2, Lemma 12.18] we conclude

$\limsup_{n \to \infty} n^{c'\eta} \sup_{z \in T} L(\tilde\nu_n(\cdot, z), \nu(\cdot, z)) < \infty$

and hence by the triangle inequality for Lévy distance

$\limsup_{n \to \infty} n^{c''\eta} \sup_{z \in T} L(\nu_n(\cdot, z), \nu(\cdot, z)) < \infty$

for some $c'' > 0$. Applying [2, Lemma 12.18] and [2, Lemma 10.8] we obtain

$\limsup_{n \to \infty} n^{c'''\eta} \sup_{z \in T} \sup_{x > 0} |\nu_n(x, z) - \nu(x, z)| < \infty$

for some $c''' > 0$, which yields (46) (with some room to spare). This proves (44).

(Regarding the one-dimensional integrals on the boundary of $T$: by employing a smooth cutoff to $T$ rather than a rough one, one can dispense with the need to consider boundary integrals.)

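The only remaining task is to prove (45). We would like to reduce matters to
establishing that almost surely we have

(49) $\lim_{n \to \infty} \int_0^{\varepsilon_n} \log x \, \nu_n(dx, z) = 0$

for almost every $z$. The Lebesgue dominated convergence theorem does not apply
directly. However, observe from the triangle inequality in $L^2(T)$ that

$(\int\int_T |\int_0^\infty \log x \, \nu_n(dx, z)|^2 \, ds \, dt)^{1/2} = 2 (\int\int_T |\frac{1}{n} \sum_{k=1}^{n} \log|\lambda_k - z||^2 \, ds \, dt)^{1/2} \le \frac{2}{n} \sum_{k=1}^{n} (\int\int_T |\log|\lambda_k - z||^2 \, ds \, dt)^{1/2}$,

and the right-hand side is bounded uniformly in $n$ since $\log|z|$ is locally square-integrable. From bounds on $\nu$ (e.g. [2, Lemma 10.8]
and the estimates used to prove [2, Lemma 10.10]), we also have

$(\int\int_T |\int_{\varepsilon_n}^\infty \log x \, \nu(dx, z)|^2 \, ds \, dt)^{1/2} < \infty$,

which by (46) implies that

$(\int\int_T |\int_{\varepsilon_n}^\infty \log x \, \nu_n(dx, z)|^2 \, ds \, dt)^{1/2}$

is bounded uniformly in $n$. Thus

(50) $(\int\int_T |\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)|^2 \, ds \, dt)^{1/2}$

is bounded uniformly in $n$, which implies that the sequence of functions $\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)$
is uniformly integrable on $T$.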

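Now we can deduce (45) from (49). To see this, let $M > 1$ be a large parameter,
and let $T_{M,n}$ be the set of all $z \in T$ such that $|\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)| \le M$. From (49) and
the Lebesgue dominated convergence theorem we have

$\lim_{n \to \infty} \int\int_{T_{M,n}} |\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)| \, ds \, dt = 0$.

On the other hand, from the uniform boundedness of (50) we see that

$\limsup_{n \to \infty} \int\int_{T \setminus T_{M,n}} |\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)| \, ds \, dt \ll \frac{1}{M}$.

Adding these two estimates, and then letting $M \to \infty$, we obtain (45).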
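The $1/M$ bound on the exceptional set comes from the elementary inequality $|f| \le |f|^2/M$ wherever $|f| > M$, so a uniformly bounded $L^2$ norm forces the large part of the integral to be small. A minimal discrete sketch, with hypothetical sample values and weights standing in for the function on $T$ and the area element:

```python
def split_by_threshold(values, weights, M):
    """Split sum_i w_i*|f_i| into the part with |f_i| <= M and the part with |f_i| > M."""
    small = sum(w * abs(f) for f, w in zip(values, weights) if abs(f) <= M)
    large = sum(w * abs(f) for f, w in zip(values, weights) if abs(f) > M)
    return small, large

def second_moment(values, weights):
    """Discrete analogue of the squared L^2 norm: sum_i w_i*|f_i|^2."""
    return sum(w * abs(f) ** 2 for f, w in zip(values, weights))
```

For any threshold `M`, the `large` part returned by `split_by_threshold` is at most `second_moment(values, weights) / M`.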
It remains to prove (49). By Fubini's theorem, it suffices to show for every $z$ that
(49) holds almost surely. But observe that the integrand in (49) vanishes whenever
$\frac{1}{\sqrt{n}} N_n - z I_n$ has least singular value at least $n^{-B}$. By Theorem 2.5, this holds with
probability at least $1 - O(n^{-100})$, if $B$ is sufficiently large. The claim then follows
from the Borel-Cantelli lemma. This concludes the proof of Theorem 1.2.



13. Relaxation of the moment condition

We observe that the bound (46) was established with some room to spare. In fact,
the arguments in [2] allow one to relax the condition $E|x|^{2+\eta} < \infty$ to the slightly
weaker condition

(51) $E|x|^2 \log^C(2 + |x|) < \infty$

for any sufficiently large constant $C$. By inspecting the arguments in [2], we see
that any $C > 16$ will work. Perhaps a better constant can be obtained by tightening
some calculations, but we do not try to pursue this direction. It seems to us that
in the current approach, the extra log term cannot be removed completely in order
to establish the full conjecture.

We sketch the necessary changes to the argument as follows. The only part of
the argument which needs any attention at all is the proof of (46) (the remaining
components of the argument work even just assuming finite second moment for $x$).
We fix $\delta$ close to 1/4. We argue as before but with $\eta$ set equal to a constant multiple
of $\log\log n / \log n$ (the constant depending on $C$), thus $\eta$ now decays slowly in $n$.
One easily checks using (51) that $E|\hat a_{ij}|^{2+\eta}$ is bounded
uniformly in $n$ by some bound $B$. We then use the arguments from [2] as before,
noting that $n^{c\eta}$ will grow faster than $\log n$ if $C$ is chosen large enough. Almost
all of the arguments in [2] go through even when $\eta$ depends on $n$. The one task
which requires some care is the verification of (47). Following the arguments in [2,
§10.5.2] to prove (47), everything goes through without difficulty except one step
in the proof of [2, Lemma 10.13], in which one needs to establish that

(in order to use the Borel-Cantelli lemma to neglect the contribution of the event
that $\|\tilde H_n\|$ exceeds the relevant threshold, for all $z \in T$; note that the computation
in [2, p. 311] here contains some minor typographical errors). Writing $\tilde H_n$ in terms
of $\tilde N_n$ and using the triangle inequality to dispose of the $z I_n$ terms, it suffices to show that

Using the moment method, we obtain the bound
for any integer $k \ge 1$. If we choose $k := \lfloor K \log n / \log\log n \rfloor$ for some sufficiently large absolute
constant $K$, then the resulting factor becomes $O(n^{-100})$.

To conclude the argument, it suffices to show that

(52) $E \, \mathrm{tr}((\tilde N_n \tilde N_n^*)^k) \ll_{B,K} O(n)^{k+1}$.

This type of bound was established for bounded $k$ in [2, Lemma 10.11] using the
moment method. But it is well-known that the method extends to much higher
values of $k$, in particular $k = O_K(\log n / \log\log n)$. Indeed, the left-hand side of (52) can be
expanded as

(53) $\sum_{i_1, \ldots, i_k, j_1, \ldots, j_k \in \{1, \ldots, n\}} E \, \tilde a_{i_1 j_1} \overline{\tilde a_{i_2 j_1}} \tilde a_{i_2 j_2} \cdots \tilde a_{i_k j_k} \overline{\tilde a_{i_1 j_k}}$.

To estimate this, we consider the closed walk of length $2k$ on the set $\{1, \ldots, n\} \times
\{1, 2\}$, in which one walks from $(i_1, 1)$ to $(j_1, 2)$ to $(i_2, 1)$ to $(j_2, 2)$, and so forth to
$(i_k, 1)$ to $(j_k, 2)$ and then back to $(i_1, 1)$. If there is any edge traversed exactly once
then the summand in (53) vanishes (since $x$ has mean zero, and since all the $\tilde a_{ij}$
are independent). Thus we may assume each edge is traversed at least twice, thus
there are at most $k$ edges traversed, and thus at most $k + 1$ vertices. Suppose in
fact that there are $l + 1$ vertices traversed for some $1 \le l \le k$. Then there are at
least $l$ edges traversed, and so the sum over all edges of the multiplicity minus two
is at most $2k - 2l$. Since $\tilde a$ has a second moment of $O(1)$ and is bounded by $O(n^{\delta})$,
we conclude that the summand in (53) is at most $O(1)^k O(n^{\delta})^{2k-2l}$. On the other
hand, the number of closed walks of length $2k$ in a set of $2n$ vertices which traverse
exactly $l + 1$ vertices can be computed to be at most

(see [8] or the introduction of [30]).

Thus the total contribution to (53) can be bounded by 

The last ($l = k$) term (which is the dominating term) is of order $O(1)^k n^{k+1}$, which
is acceptable. As for the $l < k$ terms, we can bound their contribution crudely by

$\sum_{l=1}^{k-1} O(1)^k O(k n^{\delta})^{2k-2l} n^{l+1} = o(n^{k+1})$,

using the definition of $k$ and the fact that $\delta$ is small. Thus, this contribution
is negligible compared to the main term. This proves (52), and completes the
derivation of the circular law under the hypothesis (51).
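The walk expansion behind (53) can be checked by brute force in a toy case. The sketch below assumes real entries that are i.i.d. symmetric signs (so conjugation plays no role, and a summand has expectation 1 when every edge is used an even number of times and 0 otherwise); this is a simplification of the truncated entries used in the argument, and the function names are illustrative.

```python
from itertools import product

def expected_trace_by_walks(n, k):
    """Count index tuples (i_1..i_k, j_1..j_k) whose closed walk
    (i_1,j_1),(i_2,j_1),(i_2,j_2),...,(i_k,j_k),(i_1,j_k) uses every edge
    an even number of times; for +-1 entries this equals E tr((N N^T)^k)."""
    total = 0
    for idx in product(range(n), repeat=2 * k):
        rows, cols = idx[:k], idx[k:]
        counts = {}
        for t in range(k):
            for edge in ((rows[t], cols[t]), (rows[(t + 1) % k], cols[t])):
                counts[edge] = counts.get(edge, 0) + 1
        if all(c % 2 == 0 for c in counts.values()):
            total += 1
    return total

def expected_trace_brute_force(n, k):
    """Average of tr((N N^T)^k) over all 2^(n*n) sign matrices N."""
    total = 0
    for signs in product((-1, 1), repeat=n * n):
        N = [signs[r * n:(r + 1) * n] for r in range(n)]
        G = [[sum(N[r][c] * N[s][c] for c in range(n)) for s in range(n)]
             for r in range(n)]
        P = [[int(r == s) for s in range(n)] for r in range(n)]
        for _ in range(k):
            P = [[sum(P[r][m] * G[m][s] for m in range(n)) for s in range(n)]
                 for r in range(n)]
        total += sum(P[r][r] for r in range(n))
    return total / 2 ** (n * n)
```

Both functions are exponential in their inputs and are meant only for very small `n` and `k`; for `n = 2`, `k = 2` each gives 12.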
14. Rate of Convergence

Let us return to the original hypothesis of bounded $(2 + \eta)$th moment for some fixed
$\eta > 0$. The above arguments can be pursued in more detail to obtain the more
quantitative result that with probability 1, we have

(54) $\sup_{s,t} |\mu_n(s,t) - \mu_\infty(s,t)| \le n^{-\eta'}$

for some $\eta' > 0$ depending on $\eta$, and all sufficiently large $n$.

A full exposition of this improvement would be very tedious, so we only give a
brief sketch of how the argument proceeds. We first make some Fourier-analytic
reductions, analogous to the proof of Weyl's equidistribution theorem, to reduce
matters to controlling the characteristic function $c_n(u,v)$.

Firstly, from [2, Lemma 10.5] we have

$\frac{1}{n} \sum_{k=1}^{n} |\lambda_k|^2 \le \frac{1}{n^2} \sum_{j,k=1}^{n} |x_{jk}|^2$.

Applying the Kolmogorov law of large numbers, we conclude that with probability
1

$\frac{1}{n} \sum_{k=1}^{n} |\lambda_k|^2 = O(1)$

for all sufficiently large $n$. In particular, $\mu_n$ assigns a measure of $O(n^{-\eta'})$ to
the complement of the square $[-n^{\eta'/2}/2, n^{\eta'/2}/2]^2$ (one should think of this as a
quantitative tightness estimate on $\mu_n$). If we then let $d\tilde\mu_n$ be the push-forward of
the measure $d\mu_n$ to the torus $(\mathbb{R}/n^{\eta'/2}\mathbb{Z})^2$, and similarly define $d\tilde\mu_\infty$, it thus suffices
to show that

'^^P I /('R/„'('/2zP l[-ni'/2/2,s]x[-n'''/2/2,t](s')i')'^An(^'>*') 

|s|,|t|<ri'''/2/2 

~ /(R/n'''/2z)2 -'- [-n'''/2 /2,s] X [-nv'/^ /2,t] i^' ' t')djloo {s' ,t')\ 

Let be a bump function adapted to the ball iJ(0, ), and let ip : (R/n'' /-^Z)^ 
C be the Fourier series 

^(*'^)^=;^ E ¥'(w,z;)ev^-+v^''*. 

«,'ye27TO-'''/2z 

This is an approximation to the identity, and one can then verify the pointwise 
bounds 

l[_„','/2/2,s]x[-Tii'/2/2,t] > l[-n'''/2/2,s-n-5')']x[-Ti'''/2/2,t-„-5^'] * (p + 0{n ) 

and 

l[_„^'/2/2,s]x[-n')'/2/2,t] < l[-„i'/2/2,s+n-5''']x[-Ti')'/2/2,t+„-5T,'] *<^ + 0(n ) 
(say). Because of this, and the fact that $\tilde\mu_n$, $\tilde\mu_\infty$ are probability measures, it will
suffice to show that

^^P I I(B./ni'/'^Z)2 l[-ri'7'/2/2.s]x[-n'7'/2/2,t] * 'f>{s' ,t')djln{s' ,t') 

|s|,|t|<Ti'''/2/2 

~ /(R/n»)'/2Z)2 l[-n'''/2/2,s]x[-ri'''/2/2,t] * ^^(s') ^O'^Moo («',*') I 

Taking Fourier transforms and using the triangle inequality, we can bound the 
left-hand side by 



1/ e^'''+^^^'dfLn{s,t)- I e^"^+^-*dMoo(s,i)|, 



n^oo"' sup 

u,ve2vn-'n'/^Z:\u\,\v\<S.n^°n' ^(R/ni'/2z)2 ^(R/7ii'/2z)2 

which is equal to 

n^°°'' sup \cn{u,v) — c{u,v)\. 

u.ve^nn-"' /^Z:\u\,\v\<^n^°^' 

Thus it will suffice (by the union bound and the Borel-Cantelli lemma) to show
that for any fixed $u, v$ in the above range, one has

(55) $P(|c_n(u,v) - c(u,v)| \ll n^{-200\eta'}) \ge 1 - O(n^{-10})$

for all sufficiently large $n$.

To prove (55) one repeats the proof of (43), which requires going through all the
relevant arguments in [2] and noting that all the almost sure convergence results can
be replaced instead with more quantitative polynomial convergence results (similar
to (55)). We perform only one of these steps in detail, namely the proof of the
quantitative analogue of (45),

$P(| \int\int_T \int_0^{\varepsilon_n} \log x \, \nu_n(dx, z) \, ds \, dt | \ll n^{-200\eta'}) \ge 1 - O(n^{-10})$.

Inspecting the proof of (49), we see that for each fixed $z$, $\int_0^{\varepsilon_n} \log x \, \nu_n(dx, z)$ vanishes
with probability $1 - O(n^{-100})$. By Fubini's theorem and Markov's inequality, we thus
see that with probability $1 - O(n^{-10})$, the set $\{z \in T : \int_0^{\varepsilon_n} \log x \, \nu_n(dx, z) \ne 0\}$ has
measure at most $n^{-90}$. Since (50) is bounded uniformly in $n$, the claim now follows
from the Cauchy-Schwarz inequality.

Remark 14.1. It is quite likely that one can make the convergence even more
quantitative, establishing a bound of the form

$P(\sup_{s,t} |\mu_n(s,t) - \mu_\infty(s,t)| \le n^{-\eta'}) \ge 1 - O(n^{-1-\eta'})$

for all $n \ge 1$; note that the claim (54) is a corollary of this bound and the
Borel-Cantelli lemma. This requires replacing the Kolmogorov law of large numbers with
a more quantitative law of large numbers which takes advantage of the fact that
the random variable $|x|^2$ does not merely have finite first moment, but in fact has
finite $(1 + \eta/2)$th moment. We omit the details.
15. The sparse case

In this section we sketch how one can modify the arguments in Section 12 to obtain
the circular law for sparse matrices (i.e. Theorem 1.3). The proof shall be a
modification of that of Theorem 1.2. In that theorem, one first needed the convergence

$\lim_{n \to \infty} \frac{1}{n^2} \sum_{j,k=1}^{n} |x_{jk}|^2 = E|x|^2 < \infty$,

which was a consequence of the Kolmogorov law of large numbers, in order to obtain
tightness of the $\mu_n$. In the sparse case, the analogous convergence result one needs
is

(56) $\lim_{n \to \infty} \frac{1}{\rho n^2} \sum_{j,k=1}^{n} I_{j,k,\rho} |x_{jk}|^2 = E|x|^2 < \infty$.

But one easily computes that with probability 1, $I_{j,k,\rho}$ is equal to 1 for $(1 + o(1))\rho n^2$
values of $j, k$, and so this claim also follows from the Kolmogorov law of large
numbers.



We now repeat the arguments of Section 12, using Theorem 2.9 instead of Theorem
2.5. The truncation argument in [2, §10.5.1] which allows one to replace $\nu_n$ with
$\tilde\nu_n$ can be easily modified, basically by similar arguments to the one used to deduce
(56). The only step which requires care is the modification of [2, Lemma 10.15]
needed to establish the sparse analogue of (48). The proof of this lemma in [2]
requires some upper bounds for the expected moments $E \, \mathrm{tr}(\tilde H_n^k)$ of $\tilde H_n$ (see [2,
Lemma 10.11]), but it is not difficult to verify that these upper bounds continue
to hold in the sparse case. The rest of the proof of [2, Lemma 10.15] proceeds with
only minor changes.

Note (on adapting the proof of Theorem 1.2). It is also likely that the arguments in [11] (see also [19]) could be adapted to handle this
case, at least if one assumes additional moment conditions on $x$, since the lower bound $\alpha > 3/4$
required in that paper was only needed to obtain an analogue of Theorem 2.9.

Note (on the moment bounds in the sparse case). As is well-known, the expected moments reduce to a sum over paths of length $k$, such as (53).
For those paths in which each edge is traversed exactly twice, there is no difference between the
sparse matrix and dense matrix as far as the expectation is concerned. For those paths in which
an edge is traversed more than twice, the sparse matrix contributes more than the dense matrix,
but one can still show that the net contribution here is dominated by the main term in which
each edge is traversed exactly twice; roughly speaking, for each fewer vertex that one traverses,
one loses a factor of $1/\rho$ but picks up a factor of $n$, leading to a net gain of a positive power in
$n$.